In a recent post, Andrew Yates asked, "How much data is a human genome?" He then did a little math to show that an uncompressed diploid human genome, stored as a text file, would easily fit on two music CDs. That got Genetic Future's Daniel MacArthur thinking. There are, he says, other formats in which a human genome could be stored, including as images from an Illumina machine. In that format, a genome with 30x coverage would take up about 28.80 terabytes. If the sequence data is instead stored as a Sequence Read Format file, the storage space tops out at 1.98 terabytes.
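To put those figures side by side, here is a rough back-of-the-envelope sketch. The two terabyte values come from the summary above; everything else is an assumption made for illustration: a haploid genome of roughly 3.1 billion bases, a packed encoding of 2 bits per base, and decimal units (1 TB = 10^12 bytes). The names are hypothetical and do not come from either blog post.

```python
# Figures quoted in the summary above (terabytes).
ILLUMINA_IMAGES_TB = 28.80   # raw images, 30x coverage
SRF_FILE_TB = 1.98           # Sequence Read Format file

# Assumptions, not from the summary:
HAPLOID_BASES = 3_100_000_000  # approximate haploid human genome length
BITS_PER_BASE = 2              # A, C, G, T packed into 2 bits each
BYTES_PER_TB = 10**12          # decimal terabyte


def diploid_genome_bytes(bits_per_base: int = BITS_PER_BASE) -> int:
    """Size in bytes of one diploid genome at the given bits per base."""
    return HAPLOID_BASES * 2 * bits_per_base // 8


# How much larger the raw images are than the SRF file.
images_vs_srf = ILLUMINA_IMAGES_TB / SRF_FILE_TB

# How much larger the raw images are than the packed sequence alone.
packed_bytes = diploid_genome_bytes()
images_vs_packed = ILLUMINA_IMAGES_TB * BYTES_PER_TB / packed_bytes

if __name__ == "__main__":
    print(f"Packed diploid genome: {packed_bytes / 10**9:.2f} GB")
    print(f"Images vs. SRF file:   {images_vs_srf:.1f}x")
    print(f"Images vs. packed:     {images_vs_packed:,.0f}x")
```

Under these assumptions the raw images come out about 14.5 times larger than the SRF file, and several orders of magnitude larger than the bare sequence.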
A Genome in Any Format
Jun 30, 2008