Recent technological advances in genomics have produced something both "terrifying" and "exciting," Mike the Mad Biologist says: "a massive amount of data." Genome sequencing is already fast and cheap, Mike says, and it will only get faster and cheaper; the problem is shifting from how to sequence genomes at all to how best to use the information we already have. "We are entering an era where the time and money costs won't be focused on raw sequence generation, but on the informatics needed to build high-quality genomes with those data," Mike says. While it's great to be able to contemplate a $100 genome, the costs of storing and using the data could be upwards of $2,500. Researchers must find ways to store the data and to analyze everything that's already been sequenced. "You have eleventy gajillion genomes. Now what? Many of the analytical methods use 'N-squared' algorithms: that is, a 10-fold increase in data requires a 100-fold increase in computation. And that's optimistic," he says.
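To put rough numbers on the "N-squared" remark, here is a minimal sketch in Python. It assumes, purely for illustration, an analysis that compares every genome against every other genome once; the source does not name a specific method, and the function names `pairwise_comparisons` and `growth_factor` are hypothetical.

```python
from itertools import combinations


def pairwise_comparisons(n_genomes: int) -> int:
    """Count the distinct genome pairs in an all-against-all comparison.

    Assumption: each unordered pair is compared exactly once, so the
    count is n * (n - 1) / 2, which grows with the square of n.
    """
    return n_genomes * (n_genomes - 1) // 2


def growth_factor(n_before: int, n_after: int) -> float:
    """How many times more comparisons are needed after the data set grows.

    Requires n_before >= 2 so that there is at least one pair to start with.
    """
    return pairwise_comparisons(n_after) / pairwise_comparisons(n_before)


# Sanity check: the closed form matches explicit enumeration of pairs.
assert pairwise_comparisons(5) == len(list(combinations(range(5), 2)))

for n in (100, 1_000, 10_000):
    print(f"{n:>6} genomes -> {pairwise_comparisons(n):>10} comparisons")

print(f"10x more genomes -> about {growth_factor(100, 1_000):.0f}x more comparisons")
```

Under that assumption, going from 100 to 1,000 genomes raises the number of comparisons from 4,950 to 499,500, roughly the 100-fold jump described in the quote. Methods that scale worse than quadratically would grow faster still, which may be what "that's optimistic" is getting at.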
What to Do with All That Data?
Oct 07, 2010