Politics, Information Technology, and Genomics

What's in an SRF?

I have written a bit about the NCBI Short Read Archive (SRA), its internals, and data transfer rates. Here is some information about the data format people are using to submit data from the massively parallel sequencers to the SRA. I apologize in advance for all the acronyms. The SRA is currently accepting 454 data in standard flowgram format (SFF) and Solexa data in SRF format. Soon 454 and AB SOLiD will support the SRF format...

A gift for the platypus enthusiast

If you are looking for the perfect gift for the platypus enthusiast in your life, consider a t-shirt with the following image. It is even up to date with the latest research. You can get them at snorgtees.

Age quod agis

There have been a lot of studies recently indicating that multitasking is harmful to productivity. In other words, there are significant switching costs when you mentally move from one topic to another. A New Atlantis article entitled The Myth of Multitasking summarizes the findings. Ironically, this story was picked up by Slashdot, one of the larger disrupters of concentration in IT. My high school algebra teacher knew all this a long time...

How much?

The Genome Center recently published a paper entitled Aspects of coverage in medical DNA sequencing, which develops a model for diploid sequence coverage using data from massively parallel sequencing platforms (454, Solexa, SOLiD). It uses a known yardstick, 8× BAC or WGS coverage with capillary sequencing, to establish the equivalent coverage for the new sequencing platforms. It turns out you need about 20× to 30× redundancy using these new platforms to obtain the equivalent amount...

How fast?

In a recent post, I spoke about the data format that will be used by the NCBI Short Read Archive (SRA), but storing the data is only part of the problem. You also need to get the data to the SRA. At the 1000 Genomes Steering Committee meeting last month, we got some idea of the number of massively parallel sequencers currently in use at the large sequencing centers around the world. There are about...