PolITiGenomics

Politics, Information Technology, and Genomics

Next-Generation Sequencing Informatics Update


February 19th, 2010

I updated the Next-Generation Sequencing Informatics table a few weeks ago but forgot to mention it on the blog. The main update was the 50G configuration of the Illumina GA IIx. Also, the Sides & Associates blog linked to my table and referred to it as a “somewhat dated comparison of next-generation sequencing platforms.” Just to clarify, this table represents average throughput for production systems, not vendor claims about throughput and not future vaporware (Alejandro Gutierrez corrected his description in the post once I pointed this out). As new systems come online and existing platforms improve, I will update the table.

Posted in IT, genomics | 8 Comments »


8 Responses to “Next-Generation Sequencing Informatics Update”

  1. Thanks for the update. That table is a great resource whenever we start doing back-of-the-envelope calculations for potential new projects.

  2. I’m curious what you use as your source of statistics for platforms you don’t have direct access to: publications, or do you have a good network of other labs?

    This is very useful information!

  3. Thanks for the update and clarification. I’ve corrected “somewhat dated” on our blog with a more faithful explanation.

  4. Mate-pair and paired-end should not be confused, as they are fundamentally different chemistries. Also, the SRA does not require the submission of images, so most submitters are not uploading images. Is this removed from the SRA file size? It seems quite high given our experience.

    Finally, some statistics for the SOLiD 3+ would be great as they are regularly producing 800M-1B 50+50 reads per run (2 slides).

  5. Keith, I get production numbers on the platform we do not have (SOLiD), from drd at Baylor.

    Alejandro, I saw that you updated your post, and I appreciate it. I did not mention it in this post to impugn you, rather to help make a point.

    Dirk, it’s true that mate-pair and paired-end are different, but for the purposes of this table the distinction is not important (indeed, it can easily be inferred by the size of the insert). None of the submission data is for images. It is for SRF (now ~17 B/b but ~50 B/b at one time) or gzipped FASTQ (~0.1 B/b). When the SRA has a fully functioning BAM pipeline, I have those numbers (~1 B/b). Note that I leave the SRF for the older platforms for historical reasons. As for SOLiD 3+, I am happy to post them if someone can provide them (see above).

  6. Yeah. The SOLiD v3 Plus spec sheet will have all the numbers you need.

    http://www3.appliedbiosystems.com/cms/groups/mcb_marketing/documents/generaldocuments/cms_072050.pdf

  7. You should probably just report what you know, as the rest of these numbers look like typical GSC bias.

  8. Classy stuff, Seth Peterson, Field Application Scientist at Applied Biosystems. Perhaps you are showing your AB bias. As I said above, the numbers for all platforms are real. The SOLiD numbers come from Baylor, a very pro-SOLiD shop.
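
As a back-of-the-envelope sketch of how the bytes-per-base (B/b) figures quoted in comment 5 translate into storage, the Python below multiplies a run's base count by each format's approximate B/b value. The run size used here (1 billion reads of 50 bases) is a hypothetical input chosen for illustration, not a figure from the table, and the function names are invented for this example.

```python
# Approximate bytes per base (B/b) as quoted in comment 5 of this thread.
BYTES_PER_BASE = {
    "SRF": 17.0,            # "now ~17 B/b"
    "gzipped FASTQ": 0.1,   # "~0.1 B/b"
    "BAM": 1.0,             # "~1 B/b"
}


def run_bases(reads, read_length):
    """Total bases in a run: number of reads times bases per read."""
    return reads * read_length


def storage_bytes(bases, fmt):
    """Estimated storage in bytes for a given base count and format."""
    return bases * BYTES_PER_BASE[fmt]


# Hypothetical run, for illustration only: 1 billion reads of 50 bases each.
bases = run_bases(1_000_000_000, 50)

for fmt in BYTES_PER_BASE:
    gigabytes = storage_bytes(bases, fmt) / 1e9
    print(f"{fmt}: about {gigabytes:.0f} GB")
```

Swapping in a different read count or read length gives a quick estimate for another run configuration; the B/b values are rough averages, so the results are only order-of-magnitude guides.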
