If industrial history is any indication, deciphered protein structures will lead to many a business opportunity.
Structural Bioinformatics CEO Edward Maggio puts the moneymaking potential of protein structures into context. “Every commercial revolution in sciences has been preceded by the availability of structures,” he says. He runs down the list: chemical structures gave rise to the dye and drug industries, atomic structures led to nuclear energy, and polymer structures produced plastics.
“We’re right at that stage in proteomics that will spawn a revolution,” Maggio says.
That computers are the artillery of this revolution is not lost on the big compute guns. IBM revealed its interest in proteins early in 2000 when it announced that job number one for its new one-petaflop supercomputer, now under construction in Yorktown Heights, NY, will be to tackle protein folding. And last month, Big Blue reconfirmed its interest in protein forms with an equity investment in Maggio’s company. (IBM says it acquired a minority share, less than 10 percent, of Structural Bioinformatics stock.) According to Jeff Augen, Big Blue’s director of solutions development, “IBM wants to be involved because we think [protein structure] is an IT business.”
Maggio says his company has used its own “augmented homology modeling” method to generate structures for 220 protein families and thousands of individual proteins. IBM technology has been integral to the job. Structural Bioinformatics has used IBM’s 16-processor SP2 supercomputer, and IBM’s DB2 database is its preferred platform for serving up the data to customers.
With IBM’s help, Maggio and company will now scale up to a 128-processor Linux cluster. “To generate, store, and retrieve protein structure data requires incredible computational ability,” says Maggio.
IBM sees Structural Bioinformatics and other companies in the field as potential major customers. Says Augen, “The largest supercomputer in the world is 12 teraflops. We routinely have companies in this area asking for two teraflops.” Structural Bioinformatics, he predicts, will ultimately scale to the range of one teraflop.
Proteomics could be fruitful for Compaq, too. Its number one genomics customer in the US, Celera, will need a major expansion of its computing power for its leap into proteomics. Celera sequenced the human genome with a 10,000-processor, 1.3-teraflop data center. But given the complexity and sheer number of proteins in the human proteome, Celera says it will need 1,000 times more power to tackle proteomics.