AceDB is a genome database designed specifically for handling bioinformatic data flexibly. It includes tools designed to manipulate genomic data, but is increasingly also used for non-biological data.
autoseq is a small package of base calling software for ABI automated DNA sequencers. It is intended as a starting point for researchers interested in new base calling algorithms. It implements a simple Bayesian peak identification algorithm which is experimental and does not generally perform better than software provided by ABI.
Blossoc is a linkage disequilibrium association mapping tool. It attempts to build (perfect) genealogies for each site in the input, scores them according to non-random clustering of affected individuals, and flags high-scoring areas as likely candidates for containing disease-affecting variation.
COVE is an implementation of stochastic context free grammar methods for RNA sequence/structure analysis.
DNA-GUI reads image files holding pictures of DNA gel electrophoreses, assigns lanes, finds bands, and, given the proper standards, processes the pictures and computes sizes for each band identified (by the program or manually).
DNPTrapper is an assembly editing and visualization tool specifically designed for manual analysis and finishing of repeated regions. The program allows the user to view whole repeat regions at once and to edit assembly errors manually by drag and drop. It implements and visualizes the results of a statistical method that detects defined nucleotide positions (DNPs, representing single base differences between repeat units) in the presence of sequencing errors.
Dotter draws a dotplot of two sequences (DNA or protein) on the screen. Its 'Greyramp Tool' adjusts the score cutoffs for displaying dots, so the separation between noise and signal can be fine-tuned interactively. It also features a crosshair; the residue alignment of the two sequences at the crosshair position is displayed in the 'Alignment tool'.
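The dotplot-with-cutoff idea can be illustrated with a short Python sketch. It is not Dotter's implementation: the identity-based score, the window size, and the function names `dotplot` and `render` are assumptions chosen for illustration.

```python
def dotplot(seq_a, seq_b, window=3):
    """Return a matrix of window-averaged identity scores in [0.0, 1.0].

    Assumption: a match scores 1 and a mismatch scores 0. Real dotplot
    tools typically use a substitution matrix instead.
    """
    rows = len(seq_a) - window + 1
    cols = len(seq_b) - window + 1
    scores = []
    for i in range(rows):
        row = []
        for j in range(cols):
            # Count identical residues along the diagonal starting at (i, j).
            matches = sum(seq_a[i + k] == seq_b[j + k] for k in range(window))
            row.append(matches / window)
        scores.append(row)
    return scores


def render(scores, cutoff):
    """Draw a dot wherever the score reaches the cutoff."""
    return ["".join("*" if s >= cutoff else "." for s in row) for row in scores]


if __name__ == "__main__":
    scores = dotplot("ACGTACGT", "ACGTTCGT", window=3)
    # Lowering the cutoff admits weaker matches (more noise);
    # raising it keeps only the strongest diagonals (signal).
    for cutoff in (1.0, 0.6):
        print(f"cutoff = {cutoff}")
        print("\n".join(render(scores, cutoff)))
```

Scoring once and re-rendering at different cutoffs mirrors the interactive adjustment described above: the scores stay fixed and only the display threshold changes.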
ESTScan is a program that can detect coding regions in DNA sequences, even if they are of low quality. ESTScan will also detect and correct sequencing errors that lead to frameshifts.
fastDNAml is for estimating maximum likelihood phylogenetic trees from nucleotide sequences.
FASTLINK is a significantly modified and improved version of the main programs of LINKAGE that runs much faster sequentially, can run in parallel, allows the user to recover gracefully from a computer crash, and provides abundant new documentation.
GeneRecon is a software package for linkage disequilibrium mapping using coalescent theory. It is based on a Bayesian Markov-chain Monte Carlo (MCMC) method for fine-scale linkage-disequilibrium gene mapping using high-density marker maps. GeneRecon explicitly models the genealogy of a sample of the case chromosomes in the vicinity of a disease locus. Given case and control data in the form of genotype or haplotype information, it estimates a number of parameters, most importantly, the disease position.
Genomorama is a software program for interactively displaying multiple genomes. It provides a powerful yet easy-to-use interface that leverages the visualization power of modern computers (via OpenGL) and the substantial bioinformatic infrastructure provided by the NCBI (via the NCBI C toolkit). Genomorama is written in portable, highly optimized C++ and comes in three "flavors" that allow it to run natively on (most) modern operating systems: OS X (using Carbon), Microsoft Windows (using MFC), and Linux (using Motif). Executables and source code are freely provided for all flavors.
HapCluster++ is a software package for linkage disequilibrium mapping using coalescent theory. It is based on a Bayesian Markov-chain Monte Carlo (MCMC) method for fine-scale linkage-disequilibrium gene mapping using high-density marker maps. HapCluster++ is a C++ implementation of the method described in the paper below (the original implementation was in R).
The Hidden Markov Model Toolkit (HTK) is a portable toolkit for building and manipulating hidden Markov models. HTK is primarily used for speech recognition research, although it has been used for numerous other applications. HTK consists of a set of library modules and tools available in C source form. The tools provide sophisticated facilities for speech analysis, HMM training, testing, and results analysis. The software supports HMMs using both continuous density mixture Gaussians and discrete distributions, and can be used to build complex HMM systems.
Profile hidden Markov models (profile HMMs) can be used to do sensitive database searching using statistical descriptions of a sequence family's consensus. HMMER is a freely distributable implementation of profile HMM software for protein sequence analysis.
Infernal is an implementation of "covariance models" (CMs), which are statistical models of RNA secondary structure and sequence consensus. You give Infernal a multiple sequence alignment of a conserved structural RNA family, annotated with the consensus secondary structure.
PhenoTips is a tool for collecting and analyzing phenotypic information for patients with genetic disorders.
PKNOTS demonstrates a dynamic programming algorithm for globally optimal RNA pseudoknot prediction.
QRNA is a prototype ncRNA genefinder. It uses comparative genome sequence analysis to detect conserved RNA secondary structures, including both ncRNA genes and cis-regulatory RNA structures.
(commercial) QualTrace offers a real-time estimate of DNA sequence trace quality for the ABI 3730 sequencer. It can be used to identify production problems that affect read length and success rate.
RECON is a package for automated de novo identification of repeat families from genomic sequences.
RepeatFinder is a tool for finding repeating segments of DNA, called motifs, and then parsing them so that no repeating segments overlap. It is an exclusive matching algorithm: any part of the original DNA string is associated with one and only one match.
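The exclusive-matching constraint can be illustrated with a small Python sketch. This is not RepeatFinder's algorithm: the fixed motif length `k`, the greedy most-frequent-first ordering, and the function name `exclusive_repeats` are all assumptions made for the example.

```python
from collections import defaultdict


def exclusive_repeats(dna, k):
    """Find k-mers that repeat, giving each position to at most one match.

    Hypothetical greedy strategy: motifs are tried from most to least
    frequent, and an occurrence is accepted only if none of its positions
    has already been claimed.
    """
    positions = defaultdict(list)
    for i in range(len(dna) - k + 1):
        positions[dna[i:i + k]].append(i)

    used = [False] * len(dna)
    result = {}
    candidates = sorted(positions.items(), key=lambda kv: (-len(kv[1]), kv[1][0]))
    for motif, starts in candidates:
        chosen = []
        for s in starts:
            if not any(used[s:s + k]):
                chosen.append(s)
                for p in range(s, s + k):
                    used[p] = True  # claim these positions for this motif
        if len(chosen) >= 2:
            result[motif] = chosen
        else:
            # Not a repeat after exclusion: release the claimed positions.
            for s in chosen:
                for p in range(s, s + k):
                    used[p] = False
    return result


if __name__ == "__main__":
    print(exclusive_repeats("ACGTACGTTTACGT", 4))
```

A greedy pass like this is simple but not guaranteed to maximize coverage; it only demonstrates the rule that no base belongs to more than one match.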
RNABOB is an implementation of D. Gautheret's RNAMOT, but with a different underlying algorithm using a nondeterministic finite state machine with node rewriting rules.
tRNAscan-SE detects ~99% of eukaryotic nuclear or prokaryotic tRNA genes, with a false positive rate of less than one per 15 gigabases, and with a search speed of about 30 kb/second. It was implemented for large-scale human genome sequence analysis, but is applicable to other DNAs as well.
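To put the quoted figures in perspective, the following Python sketch converts them into a scan time and an expected false-positive count for a genome of a given size. The 3-gigabase genome size is an assumed round number, and `scan_estimate` is a hypothetical helper, not part of tRNAscan-SE.

```python
SEARCH_SPEED_BP_PER_S = 30_000        # "about 30 kb/second" from the description
FALSE_POSITIVES_PER_BP = 1 / 15e9     # upper bound: "less than one per 15 gigabases"


def scan_estimate(genome_bp):
    """Return (hours to scan, upper bound on expected false positives)."""
    hours = genome_bp / SEARCH_SPEED_BP_PER_S / 3600
    false_positives = genome_bp * FALSE_POSITIVES_PER_BP
    return hours, false_positives


if __name__ == "__main__":
    # Assumption: a genome of roughly 3 gigabases.
    hours, fp = scan_estimate(3_000_000_000)
    print(f"{hours:.1f} hours, fewer than {fp:.1f} expected false positives")
```

Under these assumptions, a single pass over 3 gigabases takes about 27.8 hours and is expected to yield fewer than 0.2 false positives.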
Visomics is an open-source tool for the exploration of biological omics data with a focus on genomics, proteomics, transcriptomics, and metabolomics.