tRNAscan-SE - Release history
=============================

Version 0.90 (March 6, 1996)
------------
Initial beta release


Version 0.91 (March 12, 1996)
------------
-MAJOR BUG: covels & coves executables not called correctly from
program, will result in detection of NO tRNAs in default mode!
-added better checking for all required executables before starting
analysis
-added "#include <stdlib.h>" to trnascan.c, DEC Alpha's were 
complaining with use of unprototyped 'calloc' call
-added 'make testrun' option to Makefile to allow user 
to check to see if tRNAscan-SE is running properly


Version 0.91a (March 13, 1996)
-------------
-two minor updates of Sean's old sqio library for SunOS 4.1 
compatibility:
   - fflush(NULL) fails in SunOS; replace with fflush(stderr)
   - SEEK_SET is in unistd.h in SunOS


Version 0.92 (March 14, 1996)
------------
-updated all sqio source files to squid-1.5g to avoid any other
problems already fixed by Sean in current sqio release
-updated Cove source files to work with updated verison of squid, new
version of Cove is "2.4.2a"
-added $PERLBIN variable in Makefile to allow installation on systems
that don't call perl v.5 binary 'perl' (i.e. 'perl5' instead)
-included source for 'reformat' and 'getseq' programs from Sean's sqio
function library; not built automatically, but available for users who
want to compile and use them
-with new sqio version, don't need to specify any MDEFS in Makefile
(ie. -DNORANDOM or -DGETOPTH) unless problems with compilation
 
Version 0.92a (March 18, 1996)
-------------
-fixed SEEK_SET problem in interleaved.c by adding #include <unistd.h>
(interleaved.c was a recent addition to squid library), needed for
SunOS 4.x compilation

Version 0.92b (April 4, 1996)
-------------
-added -X option to allow specification of Cove cutoff score for
reporting tRNAs
-fixed -u (use previous results file) option to allow using a regular
tabular output file; this allows easy generation of ACeDB or secondary
structure output files without having to re-run on entire sequence(s)
-changed the default name for the secondary structure file from
".allstruct" to ".ss" for simplicity

Version 0.92c (April 10, 1996)
-------------
-at Christoph's suggestion, added more info to structure output file
(-f option) to display relative & absolute location of anticodon
-also, slightly changed rest of output format, the following is an
example of the new format:

CELF22B7.trna3 (26367-26439)    Length: 73 bp
Type: Phe       Anticodon: GAA at 34-36 (26400-26402)   Score: 73.88
         *    |    *    |    *    |    *    |    *    |    *    |    *    |  
Seq: GCCTCGATAGCTCAGTTGGGAGAGCGTACGACTGAAGATCGTAAGGtCACCAGTTCGATCCTGGTTCGGGGCA
Str: >>>>>>>..>>>>........<<<<.>>>>>.......<<<<<.....>>>>>.......<<<<<<<<<<<<.

Version 0.93 (April 22, 1996)
-------------
-not released
-Major changes, added my own C implementation of an algorithm that uses
multistep weight matrix analysis detailed by Pavesi et al. (NAR 22:
1247-1256, 1994), dubbed eufindtRNA since it searches for eukaryotic
transcription signals

-new "default" run mode for tRNAscan-SE uses tRNAscan 1.3 "strict"
params, with eufindtRNA in "relaxed" mode
-results in a 40-80 fold speed increase with BETTER sensitivity and
equal selectivity as the previous default mode

-many internal labels changed to reflect that two programs (eufindtRNA
and tRNAscan) are being used as fast first-pass scanners

Minor changes:
-removed -r option (save raw tRNAscan output) since I've never used it
and it was cluttering code
-added -y option (show source of first-pass hit, "Eu" = eufindtRNA,
"Ts" = tRNAscan, "Bo" = both) - prints after "Score" column in tabular
output
-added -F option to save false-positives produced by first-pass
scanners that were later negg'ed by Cove; only for use in studying
behavior of program, not of use to general users
-added message "No tRNAs found." that prints to standard error if no
tRNAs are found in sequence(s)
-restructured -T, -C options to work with new option -E (use
eufindtRNA)

Version 0.93a (May 31, 1996)
-------------
-released to Kym at GSC
-changed ACeDB output format, an example follows:

Sequence        F02D10
Subsequence     F02D10.t3 188 70

Sequence        F02D10.t3
Source          F02D10
Source_Exons    1 38
Source_Exons    74 119
Brief_identification tRNA-Leu
Transcript tRNA "CAA Leu L"
Score tRNAscan-SE-0.93 57.23

-new Ace format adopted by Cambridge and GSC

-extended max length of tRNA to 200bp (roughly 125bp max intron
length)
-restructured available options printout

-alterted EufindtRNA to find a prokaryotic selcys (already finds
eukaryotic selcys tRNAs) based on a consensus sequence

-added PSELC.cm and ESELC.cm covariance models (prokaryotic &
eukaryotic selcys, respect.) to help Cove detect selenocysteine tRNAs
and give correct secondary structures

-changed -t option (save firstpass results) to -r option
-new -t option used to set tRNAscan parameters (R=relaxed, S=strict)

-added -P option, prokaryotic scan mode (loosens EufindtRNA params for
finding prokaryotic tRNAs)

-by default, now adds 7bp to both 5' and 3' ends of tRNA hits from
firstpass scan -- gives Cove a better chance to define entire tRNA in
case first-pass scanners slightly truncate
-added -B option, allows adjusting # bp to pad first-pass hits; can
use a value of zero to turn off bounds padding


Version 0.94 (June 8, 1996)
------------
-tightened up error checking for correct exiting of called programs
-now handles sequence sets with duplicate names correctly

Version 0.95 (June 27, 1996)
------------
-announced to Sanger, GSC, LBL (Nomi), Christoph
-updated all documentation reflecting incorporation of EufindtRNA
into tRNAscan-SE


Version 0.96 (September 1996)
------------
-fixed minor bug causing incorrect intron bounds printed when using
tRNAscan alone (-T option)
-fixed minor bug causing infinite loop if identical sequence names are
used several times in the same input FASTA file
-fixed intron prediction so only non-cannonical nucleotides appearing 
_within_ anticodon loop are predicted as an intron (before, anything
including anticodon loop _and_ after was being called as an intron

-put in filter to detect tRNA-derived repetitive element pseudogenes
(rat ID seqs, rodent B2 elements, type II mammalian ALU seqs) from
real tRNAs
-new filter: for tRNAs with <40 bit scores, if 
  1) contribution to score from secondary structure < 5 bits, or 
  2) contribution to score from primary (HMM-type) pattern < 10 bits

-this filter effectively eliminates all but one Genbank seq (a rat ID
seq, RATRSIDH) and three putative rat ID seqs found in dbEST 
(R46943,R47014,R82886)

-added option requested by GSC, -N outputs codons instead of
anticodons with all tRNA identification output

-slightly modified -F (save false positives) option to output entire
false positive sequence, not just subsequence with score between 0 &
20 bits

Version 0.96a (October 10, 1996)
--------------------------------
-added option (-Z) to run in Cove-only mode on Maspar; uses special
routine that runs covels on entire sequence set, then parses out tRNA
info

Version 1.0 (December 6, 1996)
-------------------------------
-first general release, all features as described in paper just
submitted to Nucleic Acids Research (11/27/96)

-updated to Cove source code to 2.4.4 (from 2.4.2a)

-changed the way eufindtRNA handles ambiguous bases,
now is consistent
-used to randomly choose a base among ambig choices, but not
giving consistent results (no surprise) for Sprinzl search
-so, all non-ACGT bases now are counted as a single type
of ambiguous base, and that base always has the best value of
any of the four ACGT values for that position in the A box
and B box score matrices
-may slightly increase false positive rate, but not a concern for
use with tRNAscan-SE

-added option (-L) allows checking for very long tRNAs (>192bp)
that contain group I, group II, or other long intervening seqs

-modified pseudogene filter, changed requirement that pseudogene must
have a total score "less than 40 bits" TO "less than 55 bits" 
(catches more potential pseudogenes)


Version 1.01 (February 5, 1997)
-------------------------------
-updated reference to paper (just accepted to Nucl. Acids Research)
-slightly simplified stats file output to make less confusing
-no functional changes to program

Version 1.02 (February 18, 1997)
-------------------------------
-added -G option, search for orGanellar tRNAs (mito/chloro)
-updated program manual & man page documentation

Version 1.1 (June 9, 1997) - not released
--------------------------
-option -G (organellar tRNA search mode) changed to option -O
(easier mnemonic)

-split original tRNA covariance model (CM) into three different
models, one for each domain ("TRNA2-euk.cm" for eukyarotes, 
"TRNA2-prok.cm" for prokaryotes, "TRNA2-arch.cm" for archaea)


-now, by default, eukaryotic-specific CM used
    -P uses prokaryotic specific covariance model
    -A uses archaeal model
    -G uses original TRNA2.cm general model (tRNAs from all domains)


-re-organized options in help message (-h option) & updated user
manual

-updated credits to reflect publication of tRNAscan-SE paper
in Nucleic Acids Research 25: 955-964 (1997)


Version 1.11 (November 5, 1997)
-------------------------------

-fixed minor bug that causes division by zero crash when -n or -s
options are used and no sequences are found matching these patterns

Version 1.12 (February 19, 1999)
--------------------------------
-minor fix: when -O (organellar search mode) selected, "Eukaryotic" was
displayed -- changed so it now says "Organellar" 

-minor fix: when both -H (show HMM/2'struct score breakdown) and -D
(disable pseudogene checking) options selected, it _was_ enabling
pseudogene checking;  now both work together as expected

Version 1.13 (May 16, 2000)
---------------------------
-minor option added:  -i will use versions of pre-scanners (tRNAscan-1.4 
and EufindtRNA) that have been compiled to not optimisitically call
ambiguous nucleotides.  For unfinished sequeneces with many 'N's, this
greatly speeds scanning since many fewer false positives are passed on to
Cove.  Must run "make noambig" to produce these binaries 

Version 1.20 (September 14, 2000)
---------------------------------

-MAJOR upgrade feature: The program no longer attempts to read in an
entire sequence into memory at a time.  When scanning the human genome
full chromosome sequences, it was taking >1GB of memory.  Now, the
program only reads sequences in 1 Mbp chunks, and processes them
identically otherwise.  The maximum memory now required to search
sequences of any length should be < 15 MB.

-Fixed minor bug in eufindtRNA (new version 1.1) that caused second of
two consecutive tRNAs (within 40bp) to be missed if the second tRNA
scored lower than the first.  Very few tRNAs are detected by
eufindtRNA and not tRNAscan 1.4, so this bug probably affected only a
couple tRNA detections for all completed genomes

-The default Max tRNA length has been upped from 200 to 500 bp when
using prescanners, and from 150 to 250 when using Cove only.  Also,
the default max intron length for eufindtRNA was upped from 116 to
200.  These increases will slow the program by about 30%, but will
help identify archaeal and bacterial tRNAs with introns of length 200
or less that were just barely being missed by the previous defaults.
Scan time is very short for tRNAscan-SE, so a slight increase was
deemed acceptable.  To reverse these changes in the defaults, use
the parameter setting "-L=116".  

-Fixed a minor bug when using the -L parameter that caused loss of
tRNA detection when both pre-scanners were in use, and identified
the same tRNA with differing A and B boxes.  

-Eliminated old code for running on Maspar machines (unused for
development since 1997)

-Added more complete summary statistics to .stats files.  They now
give breakdowns by isotype & anticodon, with counts for
intron-containing tRNAs, pseudogenes, selenocysteine, and other
non-standard tRNAs.

-Slightly reformated output file (*.out) columns so that for
large sequences, tab columns do not go out of alignment.  This
slightly changes the white spaceing among column headers.


Version 1.21 (October 5, 2000)
------------------------------
-Added automatic option that removes runs of 10 N's or 
more from consideraton by pre-scanners; better solution
thann -i option which disqualifies tRNAs with any N's
at all.  Also better since this can be default behavior,
users don't "need to know" about using the -i option,
whether they have many N's or not.  

-Removed -i option (see above change)

-fixed bug in use of "-O" (organellar scan mode) option
Should work correctly now.  (Fixed bug: thought -O should
take a parameter, so the next parameter on the line was
sucked up, causing unintended run specifications)

-Changed "Prokaryotic mode" to "Bacterial Mode", which is
more accurate (archaeal sequences were _not_ used to train
the old prokaryotic model).  No functional changes to any
of the models or run parameters.  

-Added "-B" option for Bacterial scan mode.  Same as old
"-P" option, but better named now.  "-P" still works
same as before to prevent breaking of programs that
use tRNAscan-SE  (i.e.  -B equals -P) 

-Switched old "-B" option (nucleotides of padding from
pre-scanner hits to Cove scans) to "-z".  Same function.
I doubt if more than 1 other person in the world uses
this option for tweaking the program, so I'm not worried
about breaking other people's pipeline analysis scripts.

-When using the general tRNA covariance model (-G option),
it now says: "General".

Version 1.22 (Not released)
---------------------------
-minor: Added tRNAscan-SE version number to .stats file output

Version 1.23 (Fixed April 24, 2001, not released until April 2002)
------------
-minor: Was not handling 'X' characters gracefully in input
sequences.  Even though not IUPAC, the program now replaces
X's with N's so it doesn't throw errors.