PolITiGenomics

Politics, Information Technology, and Genomics

Next-Generation Sequencing Informatics

Below is a table of informatics and IT statistics for the major next-generation (massively parallel) sequencing platforms. The figures are approximate and should be used only for general informational purposes.

Next-Generation Sequencing Statistics

Vendor:       Roche     Roche     Roche     Illumina  Illumina  Illumina  ABI       ABI
Technology:   454       454       454       Solexa    Solexa    Solexa    SOLiD     SOLiD
Platform:     GS 20     FLX       Ti        GA        GA II     GA II     1         2
Reads:        500 k     500 k     1 M       28 M      80 M      80 M      40 M      115 M

Fragment
Read length:  100       200       350       35        50        75        25        35
Run time:     6 hr      7 hr      9 hr      3 d       3 d       4 d       6 d       5 d
Yield:        50 Mb     100 Mb    400 Mb    1 Gb      4 Gb      6 Gb      1 Gb      4 Gb
Images:       11 GB     13 GB     27 GB     500 GB    1.1 TB    1.7 TB    1.8 TB    2.5 TB
PA Disk:      3 GB      3 GB      15 GB     175 GB    300 GB    350 GB    300 GB    750 GB
PA CPU:       10 hr     140 hr    220 hr    100 hr    70 hr     100 hr    NA        NA
SRA:          500 MB    1 GB      4 GB      30 GB     50 GB     75 GB     100 GB    140 GB

Paired-end
Read length:  -         200       -         2×35      2×50      2×75      2×25      2×35
Insert:       -         3.5 kb    -         200 b     200 b     200 b     3 kb      3 kb
Run time:     -         7 hr      -         6 d       6 d       8 d       12 d      10 d
Yield:        -         100 Mb    -         2 Gb      8 Gb      11 Gb     2 Gb      8 Gb
Images:       -         13 GB     -         1 TB      2.2 TB    3.4 TB    3.6 TB    5 TB
PA Disk:      -         3 GB      -         350 GB    500 GB    600 GB    600 GB    1.5 TB
PA CPU:       -         140 hr    -         160 hr    120 hr    170 hr    NA        NA
SRA:          -         1 GB      -         60 GB     100 GB    150 GB    200 GB    280 GB
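
As a rough cross-check of the table, yield should come out close to the number of reads multiplied by the read length (doubled for paired-end runs). The sketch below assumes that relationship, which the table does not state explicitly, and the helper name `approx_yield_bases` is hypothetical. The inputs are the approximate read counts and read lengths listed for the Illumina GA and GA II.

```python
def approx_yield_bases(reads: int, read_length: int, ends: int = 1) -> int:
    """Estimate yield in bases as reads x read length x ends (assumed relation)."""
    return reads * read_length * ends


# Illumina GA fragment run: 28 M reads of 35 bases
ga_fragment = approx_yield_bases(28_000_000, 35)

# Illumina GA II paired-end run: 80 M reads of 2 x 50 bases
ga2_paired = approx_yield_bases(80_000_000, 50, ends=2)

print(f"GA fragment:      {ga_fragment / 1e9:.2f} Gb")  # table lists about 1 Gb
print(f"GA II paired-end: {ga2_paired / 1e9:.2f} Gb")   # table lists about 8 Gb
```

Both estimates land within rounding of the listed yields, which is consistent with the table's figures being approximate.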

Notes:

  • Units: B - bytes, b - bases
  • PA is primary analysis (includes image feature extraction and base calling)
  • PA CPU is calculated as the wall clock multiplied by the number of CPU cores
  • ABI SOLiD data are representative of a single slide
  • ABI SOLiD primary analysis is done on the instrument cluster
  • SRA is the size of the files (SFF or SRF) that are submitted to the NCBI Short Read Archive
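
The PA CPU definition in the notes amounts to a single multiplication. A minimal sketch follows; the function name `pa_cpu_hours` and the example inputs (12.5 hours of wall clock time on 8 cores) are hypothetical and purely illustrative, not measurements behind any figure in the table.

```python
def pa_cpu_hours(wall_clock_hours: float, cpu_cores: int) -> float:
    """CPU time for primary analysis: wall clock time multiplied by core count."""
    return wall_clock_hours * cpu_cores


# Hypothetical inputs, not taken from the table
example = pa_cpu_hours(12.5, 8)
print(f"{example} CPU hours")
```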