Web document 13-1. Approaches to genomics: five perspectives
As we survey the tree of life, consider these five perspectives:
Approach I: catalog genomic information
Genome size; number of chromosomes; GC content; isochores; number of genes; repetitive DNA; unique features of each genome
Techniques: genomic DNA sequencing; assembly; annotation; intrinsic and extrinsic gene prediction
Approach II: catalog comparative genomic information
Orthologs and paralogs; COGs; lateral gene transfer
Techniques: comparative genomics; whole-chromosome and whole-genome alignment
Approach III: biological principles; function; mechanisms of evolution
How genome size is regulated; polyploidization; birth and death of genes; neutral theory of evolution; positive and negative selection; speciation
Techniques: molecular phylogeny; tests of selection; BLAST for evidence of duplication
Approach IV: human disease relevance
Mechanisms by which organisms cause disease, and types of responses
Techniques: various including SNPs; linkage; association; model organisms
Approach V: bioinformatics aspects
Algorithms, databases, websites
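As an illustration of Approach I, the simplest catalog entries (genome size and GC content) can be computed directly from a FASTA file. The following is a minimal sketch, not a prescribed method: the function names read_fasta and catalog are hypothetical, the toy sequences are invented for the example, and the record count equals the number of chromosomes only if the assembly has one record per chromosome.

```python
from io import StringIO


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted file handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line.upper())
    if header is not None:
        yield header, "".join(chunks)


def catalog(handle):
    """Return record count, total length, and GC fraction for a FASTA handle.

    GC fraction is computed over unambiguous bases (A, C, G, T) only,
    so runs of N do not dilute the estimate. This is an assumption of
    the sketch, not a universal convention.
    """
    n_records = total_bp = gc = acgt = 0
    for _, seq in read_fasta(handle):
        n_records += 1
        total_bp += len(seq)
        gc += seq.count("G") + seq.count("C")
        acgt += sum(seq.count(base) for base in "ACGT")
    gc_fraction = gc / acgt if acgt else 0.0
    return {"sequences": n_records, "total_bp": total_bp, "gc_fraction": gc_fraction}


# Toy input (invented for illustration); a real genome would be read with open(path).
toy = StringIO(">chr1 toy\nATGCGC\nNNAT\n>chr2 toy\nGGCC\n")
stats = catalog(toy)
print(stats)
```

For a real genome, the same function could be applied to a file opened with open(), and the per-record loop could be extended to report GC content in windows as a first look at isochores.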
For a course I teach, students analyze a genome in depth as follows.
[1] Select any genome.
[2] Prepare a written document in which you describe it from the five perspectives outlined in the course:
1) Catalog genomic information (genome size; number of chromosomes; GC content; isochores; number of genes; repetitive DNA; unique features)
2) Catalog comparative genomic information (ladder-and-constellation approach; orthologs and paralogs; COGs; lateral gene transfer)
3) Mechanisms of evolution (how genome size is regulated; polyploidization; birth and death of genes; neutral theory of evolution; positive and negative selection; speciation)
4) Human disease relevance
5) Computational biology aspects (algorithms, databases, websites)
[3] Identify an outstanding research problem and describe how genomics approaches can be, or are being, applied to solve it.
Alternative project: analyze a gene in depth
[1] Select a single protein, RNA, or DNA sequence. Unless you have a particular gene of interest, select one that is conserved across the three domains of life. Obtain a large number of homologous sequences (e.g., 100) in FASTA format.
[2] Perform a phylogenetic analysis. If your gene is conserved, use the sequence to make a tree of life. If it is protein-coding, analyze the substitution rate at different codon positions, describe ancestral sequences, provide evidence for neutral evolution or selection, etc.
[3] Describe specific cases in which the gene has duplicated (or been lost) across genomes. Provide evidence for duplication/deletion and date the occurrence(s).
[4] Describe conserved synteny for this gene across multiple genomes. Describe its neighboring genes.
[5] Describe regulatory regions controlling expression of this gene.
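For step [2], a first look at substitution patterns by codon position can be sketched as below. This is a rough illustration under stated assumptions: the function name codon_position_differences is hypothetical, the two sequences are assumed to be aligned in frame starting at codon position 1, and the values are uncorrected proportions of differing sites (p-distances), not model-based substitution rates. The toy sequences are invented for the example.

```python
def codon_position_differences(seq_a, seq_b):
    """Return the proportion of differing sites at codon positions 1, 2, and 3.

    Assumes seq_a and seq_b are aligned coding sequences of equal length,
    in frame from the first column. Columns with a gap or an ambiguous
    base in either sequence are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    usable = len(seq_a) - len(seq_a) % 3  # ignore a trailing partial codon
    diffs = [0, 0, 0]
    sites = [0, 0, 0]
    for i in range(usable):
        a, b = seq_a[i].upper(), seq_b[i].upper()
        if a not in "ACGT" or b not in "ACGT":
            continue
        sites[i % 3] += 1
        if a != b:
            diffs[i % 3] += 1
    return [d / s if s else 0.0 for d, s in zip(diffs, sites)]


# Toy pair (invented): both differences fall at third codon positions.
p1, p2, p3 = codon_position_differences("ATGGCTAAA", "ATGGCCAAG")
print(p1, p2, p3)
```

An excess of differences at the third position relative to the first and second is the pattern typically expected under purifying selection on the protein; a formal test of selection would require a codon model rather than these raw proportions.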