Bioinformatics and Functional Genomics 3rd Edition

Chapter 9: DNA Analysis: Microarrays and Next-generation Sequencing

Next-generation sequencing (NGS) is transforming all areas of biomedical research, and it is also having a major impact on clinical diagnosis, agriculture, evolutionary studies, and other domains. This chapter provides a gentle introduction to the field. Sequencing technology produces large amounts of data: typically we need about 1 terabyte of storage to handle the raw and processed data associated with a single human genome. Data analysis at this scale is best handled using Unix or Unix-like operating systems (including the Mac terminal), and most software tools for NGS analysis are written for the Linux platform. We'll explore a workflow that extends across the following topics:

  • An introduction to sequencing technology
  • A description of FASTQ files (and how to find them); these contain the sequence reads and quality scores
  • Alignment of DNA reads to a reference genome
  • Calling variants
  • Interpreting the significance of variants
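
To make the FASTQ topic above more concrete, here is a minimal Python sketch of how the sequence reads and quality scores in such a file might be read. It assumes the common four-line record layout (a header line starting with "@", the sequence, a separator line starting with "+", and a quality string) and the Phred+33 quality encoding; files with line-wrapped sequences or an older quality offset would need more care. The names FastqRecord and parse_fastq, and the example read, are hypothetical and chosen only for illustration.

```python
import io
from typing import Iterator, List, NamedTuple, TextIO


class FastqRecord(NamedTuple):
    """One read: its name, its bases, and one quality score per base."""
    name: str
    sequence: str
    qualities: List[int]


def parse_fastq(handle: TextIO, offset: int = 33) -> Iterator[FastqRecord]:
    """Yield records from a four-line-per-read FASTQ file.

    Assumption: each record is exactly four lines and quality characters
    use the Phred+33 encoding (score = ASCII code - 33).
    """
    while True:
        header = handle.readline()
        if not header:
            return  # end of file
        header = header.rstrip("\n")
        if not header:
            continue  # tolerate blank lines between records
        sequence = handle.readline().rstrip("\n")
        separator = handle.readline().rstrip("\n")
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError(f"Malformed FASTQ record near: {header!r}")
        if len(sequence) != len(quality):
            raise ValueError(f"Sequence/quality length mismatch in: {header!r}")
        # Convert each quality character to its numeric Phred score.
        scores = [ord(char) - offset for char in quality]
        yield FastqRecord(header[1:], sequence, scores)


# A made-up single-read example, held in memory instead of a file.
example = "@read1\nGATTACA\n+\nIIIIII#\n"
records = list(parse_fastq(io.StringIO(example)))
for record in records:
    print(record.name, record.sequence, record.qualities)
```

In this encoding a higher score means a more confident base call; in the made-up read above, the final base has a much lower score than the others. For real data sets, an established library is usually a better choice than a hand-written parser.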