Making better sense of the human genomes that have already been sequenced will require sequencing even more of them, Craig Venter tells Technology Review.
"[G]enomics follows a law of very big numbers," Venter says. "I’ve had my genome for 15 years, and there’s not much I can learn because there are not that many others to compare it to."
With the launch this year of his new company, Human Longevity, Venter aims not only to sequence tens of thousands of people, but also to collect physiological data such as their brain size and how much blood their hearts can pump. So far, he tells Tech Review, his company has sequenced 500 people, who are now beginning to undergo those additional tests.
To cope with the sheer amount of data the project will produce, and to use that data to compare all those newly sequenced genomes, Venter has taken on Franz Och, who led the Google Translate effort.
"Google Translate started as a slow algorithm that took hours or days to run and was not very accurate. But Franz [Och] built a machine-learning version that could go out on the Web and find every article translated from German to English or vice versa, and learn from those," Venter says. "And then it was optimized, so it works in milliseconds."
Translating the language of the human genome, Venter adds, will be an even greater challenge.