Reading genomes, bit by bit

Because of genome sequencing technology, we have arrived at a remarkable time in biology. At last we can see the source code for life: the complete genomic DNA sequences that specify development, regulation, and function of organisms. But we still don't really understand how to read this trove of encoded information, and we don't really understand in any satisfying detail how it evolved.

Our laboratory develops computational methods for genome sequence analysis. We are particularly interested in methods for detecting remote evolutionary relationships among protein and RNA sequences.

We're a Howard Hughes Medical Institute laboratory at Harvard University, in the Molecular & Cellular Biology Department, with affiliations to Applied Mathematics, to the FAS Center for Systems Biology, and to the Center for Brain Science.

How to reach us

Department of Molecular & Cellular Biology
Harvard University
Northwest Building, Room 429
52 Oxford Street
Cambridge MA 02138, USA


Hidden Markov models for sequence profile analysis.

RNA structure analysis using covariance models.

Database of protein family alignments and hidden Markov models.

The Rfam database of RNA alignments, consensus secondary structures, and profile SCFGs.

The Dfam database of repetitive DNA sequence elements.