NEW YORK--It's been nearly two years since genomics experts first predicted that the pursuit of single-nucleotide polymorphisms (SNPs), widely considered to be the next frontier of the genomics revolution, would demand new approaches from bioinformatics toolmakers. Identifying and analyzing genetic variations among individuals and populations in order to develop more effective drugs has been characterized as postgenomic-era work--a project to be tackled after the human genome sequence is completed. But recent developments indicate that the challenge is more imminent, and that many bioinformatics companies are unprepared for it.
At a five-day genomics conference that drew nearly 700 industry scientists, executives, and investors to San Francisco this month, the SNP was the topic du jour. Nearly a dozen lectures, such as "Technologies for Detection of SNPs," presented by a Merck researcher, and "Large-Scale Identification, Mapping, and Genotyping of SNPs in the Human Genome," by a Bristol-Myers Squibb investigator, outlined the informatics challenges pharmaceutical companies are facing as they begin to seek out SNPs.
The same week, on March 4, a story on page one of the Wall Street Journal described how major drug companies are considering banding together to form a nonprofit venture to generate a public SNP database, "in perhaps as short a time as two years," at a cost of $50 million or more. The article explained that the aim of the pharmaceutical consortium, reportedly spearheaded by Glaxo Wellcome, would be to "prevent the biotech companies from possibly locking up all the key genetic data for themselves."
A pharmaceutical company executive who has participated in the meetings told BioInform that the venture includes a bioinformatics component. Otherwise, participants contacted by BioInform declined to comment.
The unusual alliance among major pharmaceutical competitors "opens up a new era in industrial and academic research," remarked Matthew Huang, senior scientist at bioinformatics company Pangea Systems. "This fundamentally changes the biopharmaceutical landscape to make it harder for small biotechs to follow the Incyte business model," he said.
In fact, the desire to unshackle themselves from genomic data providers such as Incyte is reportedly what is driving consortium participants. A researcher with a small technology company who attended meetings of the group last year observed that the proposal to collaboratively capture SNP information in a public database seemed born of pharmaceutical companies' frustration over paying huge sums and downstream royalties for raw EST and genomic data that they now feel was overvalued. "In the last meeting I went to they were openly saying, we're not going to end up paying Incyte twice," said the researcher, adding that one pharmaceutical executive said he was determined that his company was "not going to be raped again."
The same source said that at another meeting, representatives of companies with market caps of $1 billion or less were ushered out so that executives of bigger companies could confer among themselves. "They had us wait outside the room for three hours while they decided their position," the researcher said, adding, "It was pretty dramatic."
"Everybody has their own strategy for how to react to a single company with a huge technical lead," observed Jay Lichter, vice-president of pharmacogenomics at Genset, a company whose business plan revolves around capturing SNP data for pharmaceutical clients. "Genset has a strategic advantage over everybody in the SNP game by three years," he contended. "One company, Glaxo, was the most uncomfortable with that and they led the effort to put the collaboration together."
To be sure, the Journal's article about the consortium was no surprise to most industry observers, who first saw news snippets in Science a year ago. Nor was it definitive, according to Arthur Holden, the venture's CEO, who told BioInform that the report was "premature and in many ways not accurate." Holden declined to comment further.
New genomics tack
Still, the message to the genomics and bioinformatics communities is clear: SNPs are the next big challenge. "Probably the single most important thing that will be necessary to the success of this endeavor will be very strong bioinformatics to manage all this information," said Rick Shimkets, head of internal discovery programs for CuraGen, which launched its own effort to map coding-region SNPs in November. "We're really talking about hundreds of thousands of pieces of data," Shimkets said. "It has to be searchable and browsable in a very straightforward format."
Some say the new tack in genomics will require a wholesale shift in the approach of bioinformatics technology providers. "It will change the bioinformatics scene tremendously in a way for which most players are not prepared," said Huang. He explained, "Bioinformatics has been dealing with DNA sequencing and databases. It has not been dealing with genetic markers and probability." Identifying sequence variants in genomes will require an approach that combines traditional medical genetics with traditional bioinformatics, Huang said.
Huang characterized the current state of SNP mapping as a three-pronged effort, one prong being the new industry consortium. The other two prongs are existing projects in academic settings, including MIT's Whitehead Institute, Washington University in St. Louis, and the University of Washington in Seattle; and private companies that plan to sell SNP data, including Celera, CuraGen, Genset, Incyte, and Millennium. Combined, the biotech companies and academic labs have identified "probably half a million SNPs" so far, estimated Huang. "Within the next 12 to 18 months, I would guess another million or two will be coming up," he said.
Yet, when asked to name bioinformatics companies with software designed to handle the forthcoming barrage of SNP data, Huang named only one: "Genomica's definition of bioinformatics includes genetic information and computational genetics components," he said. "That is the right way to go."
In fact, Genomica expects SNPs to begin generating new business for its Discovery Manager software, an upcoming release of which will have extended capabilities for SNP analysis.
Susan Strong, Genomica's vice-president, said Discovery Manager is able to show pedigree linked to genotype and phenotype information. "That lets you not just look at the sequence difference, but also go back and relate that to an individual in a pattern of their family, or in a case control situation where you have unrelated individuals, or in an association study where you're looking at an isolated population." She added, "That link between the genomic side, the patient information, and the sequence side is the place where I think we're unique."
Genomics companies generating SNP data said they have developed proprietary bioinformatics tools to make use of the information. CuraGen's bioinformatics toolkit, CuraTools, is capable of browsing SNP data, said Shimkets. "We have technology to complement the consortium's efforts," he said. "Our whole suite is internet-based, and that's a technology that would be advantageous to this consortium." Does he see a role for CuraGen there? "I wouldn't say no. There may be a point when we can make some contribution."
Genset and Millennium told BioInform they view the pharmaceutical companies' consortium as a validation of their decision to pursue the area. Neither company markets its bioinformatics tools separately from its data, but both said the technology they've developed would position them to take advantage of a public SNP database, should the consortium build one.