HMMER website

Lead author: Sean Eddy

Profile hidden Markov models for biological sequence analysis.

Infernal website

Lead author: Eric Nawrocki

RNA structure analysis using covariance models.

tRNAscan-SE download

Author: Todd Lowe

tRNA detection in large-scale genome sequence.

tRNAscan-SE detects ~99% of eukaryotic nuclear or prokaryotic tRNA genes, with a false positive rate of less than one per 15 gigabases, and with a search speed of about 30 kb/second. It was implemented for large-scale human genome sequence analysis, but is applicable to other DNAs as well. It applies our COVE software (see below) with a carefully built tRNA covariance model, while getting around COVE's speed limitations by using two tRNA-finding programs from other research groups as fast first-pass scanners (Fichant and Burks' program, and an implementation of an algorithm from A. Pavesi's group). It runs on any UNIX system with Perl and a C compiler installed.
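To put the quoted speed and false positive rate in perspective, here is a back-of-the-envelope sketch in Python. The 3 Gb genome size is an assumption chosen for illustration (roughly human-sized), and the function names are hypothetical; only the 30 kb/second and one-per-15-gigabases figures come from the description above.

```python
# Figures quoted in the tRNAscan-SE description above.
SEARCH_SPEED_BP_PER_S = 30_000        # "about 30 kb/second"
FALSE_POSITIVES_PER_BP = 1 / 15e9     # upper bound: fewer than one per 15 gigabases

# Assumption for illustration only: a roughly human-sized genome.
GENOME_SIZE_BP = 3e9


def scan_time_hours(genome_bp, speed_bp_per_s=SEARCH_SPEED_BP_PER_S):
    """Single-CPU scan time in hours at the quoted search speed."""
    return genome_bp / speed_bp_per_s / 3600


def expected_false_positives(genome_bp, rate_per_bp=FALSE_POSITIVES_PER_BP):
    """Upper bound on the expected number of false positive tRNA calls."""
    return genome_bp * rate_per_bp


if __name__ == "__main__":
    print(f"scan time: {scan_time_hours(GENOME_SIZE_BP):.1f} hours")
    print(f"expected false positives: < {expected_false_positives(GENOME_SIZE_BP):.2f}")
```

Under these assumptions a whole-genome scan takes a bit more than a day on one CPU, and fewer than one false positive is expected per genome.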

SSU-ALIGN website

Author: Eric Nawrocki

Structural alignment of SSU rRNA sequences.

R-scape website

Author: Elena Rivas

RNA Structural Covariation Above Phylogenetic Expectation: Analysis of covariation support for RNA conserved secondary structure in a multiple sequence alignment.

RECON website

Author: Zhirong Bao

Automated identification of repeat sequence families in genome sequences.

TORNADO download

Author: Elena Rivas

A software tool for prototyping single-sequence RNA secondary structure prediction models. Tornado implements a "super-grammar" that includes the standard thermodynamic model as a special case. It can be used to build simpler or more complex models with fewer or more parameters, and it can be used to compare thermodynamic, probabilistic, and discriminative parameterization approaches. This is the maintained (up-to-date) version of the software that accompanied Elena's paper "A range of complex probabilistic models for RNA secondary structure prediction that includes the nearest-neighbor model and more."

CONUS website

Author: Robin Dowell

Software for the exploration of lightweight stochastic context-free grammars.

This is the code accompanying Robin Dowell's paper "Evaluation of Several Lightweight Stochastic Context-Free Grammars for RNA Secondary Structure Prediction", BMC Bioinformatics 5:71, 2004. It implements several small SCFGs for single-sequence RNA secondary structure prediction.

QRNA download

Author: Elena Rivas

A prototype noncoding RNA genefinder, based on comparative genome sequence analysis.

This is the code from Elena Rivas that accompanies the paper "Noncoding RNA gene detection using comparative sequence analysis." QRNA uses comparative genome sequence analysis to detect conserved RNA secondary structures, including both ncRNA genes and cis-regulatory RNA structures.

RNABOB download

Author: Sean Eddy

Fast pattern searching for RNA secondary structures.

RNABOB is an implementation of D. Gautheret's RNAMOT, but with a different underlying algorithm using a nondeterministic finite state machine with node rewriting rules. (Computer scientists would probably cringe in horror. It works, and it's fast, but is it street legal in a computer science department? Who knows.) An RNABOB motif is a consensus pattern a la PROSITE patterns, but with base-pairing.
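To illustrate what "a consensus pattern with base-pairing" means, here is a minimal Python sketch that scans for a simple hairpin: a 5' stem, a loop that may be constrained by a sequence consensus, and a 3' stem that must pair with the 5' stem. This is not RNABOB's descriptor syntax or its finite state machine algorithm; it is a brute-force scan, and the function name, parameters, and the choice to allow G-U pairs are all assumptions made for illustration.

```python
import re

# Allowed base pairs: Watson-Crick plus G-U wobble (an assumption for this sketch).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def find_hairpins(seq, stem_len=4, loop_min=3, loop_max=8, loop_regex=None):
    """Return (start, end, loop_len) for every hairpin match in seq.

    A match is stem_len bases, then a loop of loop_min..loop_max bases,
    then stem_len bases that pair with the first stem in antiparallel order.
    If loop_regex is given, the loop must also match that sequence consensus.
    Coordinates are 0-based, with end exclusive.
    """
    seq = seq.upper().replace("T", "U")
    hits = []
    for start in range(len(seq)):
        for loop_len in range(loop_min, loop_max + 1):
            end = start + 2 * stem_len + loop_len
            if end > len(seq):
                break  # longer loops cannot fit either
            left = seq[start:start + stem_len]
            loop = seq[start + stem_len:end - stem_len]
            right = seq[end - stem_len:end]
            # Base-pairing constraint: 5' stem pairs with the reversed 3' stem.
            if not all((a, b) in PAIRS for a, b in zip(left, reversed(right))):
                continue
            # Optional sequence constraint on the loop, PROSITE-style.
            if loop_regex is not None and not re.fullmatch(loop_regex, loop):
                continue
            hits.append((start, end, loop_len))
    return hits


if __name__ == "__main__":
    # A 4 bp stem closing a GNRA-like tetraloop.
    print(find_hairpins("GGGCGAAAGCCC", loop_regex="G[ACGU][AG]A"))
```

The point of the sketch is the combination of two kinds of constraint: the loop is matched against a primary sequence consensus, while the stems are matched only by their ability to pair with each other.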

PKNOTS download

Author: Elena Rivas

Experimental code demonstrating a dynamic programming algorithm for RNA pseudoknot prediction.

This is experimental code from Elena Rivas, demonstrating a dynamic programming algorithm for globally optimal RNA pseudoknot prediction. The algorithm is discussed in the paper "A dynamic programming algorithm for RNA structure prediction using pseudoknots."

ERATE download

Author: Elena Rivas

Maximum likelihood phylogenetic inference, including insertions/deletions.

erate is an extension of Joe Felsenstein's DNAML program which treats insertions and deletions as evolutionary events, rather than ignoring them as missing data (which is what the most widely used phylogenetic inference programs all do). This is the software that accompanied Elena's paper "Probabilistic Phylogenetic Inference with Insertions and Deletions."

Archaeopteryx (formerly ATV) website

Author: Christian Zmasek

A visualization tool for large phylogenetic trees.

RSEARCH download

Author: Robbie Klein

Sequence database searching with RNA structure queries.

RSEARCH aligns an RNA query to target sequences, using SCFG algorithms to score both secondary structure and primary sequence alignment simultaneously. It's slow, but somewhat more capable of finding significant remote RNA structure homologies than sequence alignment methods like BLAST. (By slow, we mean that you really need a substantial computing cluster to do any serious work with it; a typical single search of a metazoan genome may take a few thousand CPU hours.)

CONSAN website

Author: Robin Dowell

Pairwise structural RNA alignment.

This is the code accompanying Robin Dowell's paper "Efficient Pairwise RNA Structure Prediction and Alignment Using Sequence Alignment Constraints", BMC Bioinformatics 7:400, 2006. It implements a pinned Sankoff algorithm for simultaneous pairwise RNA alignment and consensus structure prediction.

FORESTER website

Author: Christian Zmasek

Inference of orthology and paralogy relationships in gene trees.

SQUID download

Author: Sean Eddy

A C function library for sequence analysis, bundled with much of the above software.

SQUID is my own personal library of C functions and utility programs for sequence analysis. I don't really suggest that you use it in your programs, as I change it at will. However, it does contain some small utility programs that some people have found useful in scripts that drive large HMMER tasks.

NCRNASCAN download

Author: Elena Rivas

Experimental code for a structural RNA genefinder: it doesn't actually work well, because it turns out that structural RNAs don't have much more secondary structure content than random sequence.

This is the code from Elena Rivas that goes with the paper "Secondary structure alone is generally not statistically significant for the detection of noncoding RNAs" by Elena Rivas and Sean Eddy. As the title indicates, the genefinder doesn't work (though we still think the algorithm is cool), because real RNAs don't generally have any more secondary structure content than random sequence, contrary to what we expected. The code will only be of interest to people trying to reproduce our negative results, or trying to understand the genome-scanning SCFG alignment algorithm that it implements.

HMMERVIEWER download

Author: Robin Dowell

Robin Dowell's prototype of a Perl/Tk application for viewing profile HMMs created by HMMER, including the Pfam database.

SNOSCAN download

Author: Todd Lowe

Identifies 2'-O-methylation guide snoRNAs in yeast (and possibly other) genome sequences, using a combination of snoRNA sequence/structure consensus and guide complementarity to a putative target rRNA site. See Lowe & Eddy, "A Computational Screen for Methylation Guide snoRNAs In Yeast", Science 283: 1168-1171, 1999.

COVE download

Author: Sean Eddy

Covariance models of RNA secondary structure (old version).

COVE is an implementation of stochastic context free grammar methods for RNA sequence/structure analysis. COVE is still experimental and not as well supported as I would like. It is an extremely sensitive tool for database searching for homologous RNAs, if you have an alignment of an RNA family. It requires hefty CPU resources to run properly.

SWSSE2 download

Author: Michael Farrar

Striped SIMD vectorized Smith/Waterman.

This is Michael Farrar's 2006 source code accompanying his paper "Striped Smith-Waterman speeds database searches six times over other SIMD implementations" (Bioinformatics, 2007). Michael died in 2010 while working as a senior software engineer in my laboratory. His 2006 code, developed on his personal time before he joined my group, had only been made available under a non-open-source license. With permission of his wife, who inherited his copyrighted work, I have relicensed his code and released this copy as open source under a BSD license.
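For readers unfamiliar with the underlying recurrence, here is a plain scalar Smith/Waterman scoring sketch in Python. It is not Michael's striped SIMD implementation and makes no attempt at vectorization. It only shows the local alignment recursion that the striped code accelerates. The function name, the match/mismatch scores, and the linear gap penalty are all assumptions chosen to keep the example short.

```python
def smith_waterman_score(query, target, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between query and target.

    Scalar reference version with a linear gap penalty (an assumption made
    for brevity). Only two rows of the dynamic programming matrix are kept
    in memory at any time.
    """
    prev = [0] * (len(target) + 1)   # previous row of the DP matrix
    best = 0
    for q in query:
        curr = [0]                   # column 0 is always zero
        for j, t in enumerate(target, start=1):
            diag = prev[j - 1] + (match if q == t else mismatch)
            up = prev[j] + gap       # gap in the target
            left = curr[j - 1] + gap # gap in the query
            cell = max(0, diag, up, left)  # local alignment: never below zero
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best


if __name__ == "__main__":
    print(smith_waterman_score("TTACGTTT", "GGACGGG"))
```

Each cell depends on its left neighbor in the same row, which is the data dependency that makes straightforward vectorization awkward and that SIMD layouts such as striping are designed to work around.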