Reading genomes, bit by bit

Because of rapid advances in genome sequencing technology, we can finally see the source code for life: the complete genomic DNA sequences that specify the development, regulation, and function of organisms. Our laboratory develops computational methods for genome sequence analysis. We are particularly interested in methods for identifying remote evolutionary relationships between protein and RNA sequences.

We're a Howard Hughes Medical Institute laboratory at Harvard University, in Molecular & Cellular Biology in Biolabs on the Cambridge campus. We're affiliated with the Harvard Data Science Initiative and the Center for Brain Science.

Recent publications

How to join us

Graduate students in the group come from many different Harvard programs, including Systems, Synthetic, and Quantitative Biology, MCO (Molecules, Cells, and Organisms), and Biophysics. Most programs are rotation-based. In your first year, you do about three short research projects in labs you might be interested in (like ours!).

If you're interested in doing a postdoc with us, and you're planning to obtain your own independent funding, contact me and propose a project idea. Unfortunately, because the US government has frozen federal funding to Harvard University, we do not have funding available for new postdocs at this time.

How to find us

Department of Molecular & Cellular Biology
Biological Laboratories 1008A
16 Divinity Avenue
Harvard University
Cambridge MA 02138, USA

HMMER

Hidden Markov models for sequence profile analysis.

Infernal

RNA structure analysis using covariance models.

Pfam

Database of protein family alignments and hidden Markov models.

Rfam

Database of RNA alignments, consensus secondary structures, and profile stochastic context-free grammars (SCFGs).

Dfam

Database of repetitive DNA sequence elements.