Data Types

  • Unaligned Reads: reads that have not been aligned to a reference genome (the raw data generated by the sequencer).
  • Aligned Reads: reads that have already been aligned to a reference genome.
  • Gene Level Copy Number: the number of copies of each gene in a sample. A copy number variation (CNV) occurs when the number of copies of a particular gene varies from one individual to the next.
  • Simple Germline Variant: a genetic change in a reproductive cell (egg or sperm) that becomes incorporated into the DNA of every cell in the body of the offspring. Because a germline variant can be passed from parent to offspring, it is hereditary.

File Format

  • FASTQ: a text format that stores each read's sequence together with a per-base quality score. It represents the sequence data from the clusters that pass filter on a flow cell.
  • BAM: the compressed binary version of the SAM format. A SAM file (.sam) is a tab-delimited text file that contains sequence alignment data.
  • VCF (Variant Call Format): a text format used for storing gene sequence variations.
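
To make the FASTQ entry more concrete, here is a minimal Python sketch of reading such a file. It assumes the common four-line record layout (a header line starting with "@", the sequence, a separator line starting with "+", and a quality string) and the Phred+33 quality encoding. The names FastqRecord, read_fastq, and phred_scores are hypothetical and chosen only for illustration; they do not come from any particular library.

```python
import io
from typing import Iterator, NamedTuple, TextIO


class FastqRecord(NamedTuple):
    """One read: its name, its bases, and its encoded quality string."""
    name: str
    sequence: str
    quality: str


def read_fastq(handle: TextIO) -> Iterator[FastqRecord]:
    """Yield records from a FASTQ stream, assuming four lines per record."""
    while True:
        header = handle.readline()
        if not header:
            return  # end of input
        sequence = handle.readline().rstrip("\n")
        separator = handle.readline()
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError("malformed FASTQ record")
        if len(sequence) != len(quality):
            raise ValueError("sequence and quality lengths differ")
        yield FastqRecord(header[1:].strip(), sequence, quality)


def phred_scores(quality: str, offset: int = 33) -> list[int]:
    """Decode a quality string to integer scores (assumes Phred+33 by default)."""
    return [ord(ch) - offset for ch in quality]


# Usage with an in-memory example instead of a real file.
example = io.StringIO("@read1\nACGT\n+\nIIII\n")
records = list(read_fastq(example))
scores = phred_scores(records[0].quality)
```

For real data, an established parser (for example one that also handles BAM and VCF) is usually a better choice than hand-rolled code; the sketch is only meant to show what a FASTQ record contains.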