Well, here is another quick hit from AGBT 2008. This year there were over 550 attendees and they still had to turn away over 200 people. No doubt all the interest is due to the next-generation sequencing technologies, giving labs and companies all over the world the ability to sequence DNA. Of course, just being able to generate the data does not mean you know how to deal with them. To that end, in this entry I will talk about some of the software that was mentioned at AGBT.

  • ssaha_pileup - ssaha_pileup is part of the SSAHA2 (Sequence Search and Alignment by Hashing Algorithm) package. The package uses the SSAHA engine for alignments and then ssaha_pileup to call the consensus base at each position, thereby detecting SNPs. You can find the source code and documentation on their FTP site.
  • WikiLIMS - WikiLIMS is a laboratory information management system based on the popular MediaWiki engine. Basically the folks at BioTeam have created a few templates for MediaWiki allowing people to enter standard information about each sequencing run, e.g., sample(s) information, cycles, instrument, and technician. Once those fields are populated, the user can then use the resulting page as an electronic lab notebook for the experiment. Clearly, they are targeting small labs with a couple or fewer instruments. Dick McCombie at Cold Spring Harbor Laboratory mentioned in his talk that their sequencing group uses it.
  • Anno-J - Anno-J is a very cool, Web 2.0-based application for annotating genomes. The author says it is not quite ready for release, but he is planning on releasing it as free/open source software. He hopes that will encourage others to design their own tracks and plugins.
  • Glimmer - Glimmer is a system for finding genes in microbial DNA. It is not new, but it was mentioned at the conference so I thought I would include it.
  • GISTIC - Genomic Identification of Significant Targets in Cancer (GISTIC) is a program that takes as input a variety of whole-genome array data from cancer samples and identifies regions that are statistically likely to play a role in cancer.
  • VCAKE - VCAKE is a short-read de novo assembler. We have found that VCAKE performs quite well when you are able to provide it with a sequence to nucleate contigs around. The authors are working on a color-space version for AB SOLiD data.
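
For anyone new to the pileup idea mentioned in the ssaha_pileup entry, here is a toy sketch of calling a consensus base at each position and flagging SNPs. This is not how ssaha_pileup itself is implemented; the function name `call_consensus`, the `min_depth` parameter, and the simple majority vote are all my own assumptions for illustration. It also assumes the reads are already aligned to the reference without gaps and ignores base qualities entirely.

```python
from collections import Counter


def call_consensus(reference, aligned_reads, min_depth=2):
    """Toy pileup caller (illustrative only, not the ssaha_pileup algorithm).

    reference     -- reference sequence as a string
    aligned_reads -- list of (start, sequence) pairs; start is the 0-based
                     reference position of an ungapped alignment
    min_depth     -- hypothetical cutoff: positions covered by fewer reads
                     keep the reference base and are never called as SNPs
    """
    # One counter per reference position: the "pileup" of observed bases.
    columns = [Counter() for _ in reference]
    for start, seq in aligned_reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if 0 <= pos < len(reference):
                columns[pos][base] += 1

    consensus = []
    snps = []
    for pos, (ref_base, counts) in enumerate(zip(reference, columns)):
        depth = sum(counts.values())
        if depth < min_depth:
            consensus.append(ref_base)  # too little coverage to make a call
            continue
        base, _ = counts.most_common(1)[0]  # simple majority vote
        consensus.append(base)
        if base != ref_base:
            snps.append((pos, ref_base, base))
    return "".join(consensus), snps


if __name__ == "__main__":
    ref = "ACGTACGT"
    reads = [(0, "ACGA"), (1, "CGAA"), (2, "GAAC"), (4, "ACGT")]
    print(call_consensus(ref, reads))
```

A real caller would weigh base and mapping qualities, handle indels, and consider heterozygous sites rather than taking a bare majority vote.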