DiscoveryLink is coming. Will it live up to the hype?
by Adrienne Burke
Any day now, IBM's long-awaited data-integration middleware for the life sciences market will be, as they say around the Big Blue marketing department, "GA." The product, whose origin dates back six years, will be assigned an SKU number and become, company officials promise, generally available for commercial purchase in the second quarter. This will be IBM's moment of truth in the life sciences.
The company started, as a spokeswoman says, "going after this market in a very aggressive way" about 18 months ago. But in a space where Compaq dominates in high-performance computing and Oracle is the standard database technology, IBM needed to be clever to compete.
Clever indeed. Its strategy for picking up new life sciences service contracts will be to offer the industry what it seems to need most: a killer app for integrating genomic data.
In mid-May, DiscoveryLink, a federated data management tool that employs wrappers to link varied data from multiple, disparate databases, existed only as a prototype. Two unnamed pharmaceutical companies were running pilot tests, and NetGenics and Incyte had projects underway to incorporate it into their own yet-to-be-released drug discovery tools, DiscoveryCenter and Genomics Knowledge Platform, respectively.
The product's lack of availability, however, was not keeping IBM from hyping it. Janet Perna, IBM's general manager for data management, called DiscoveryLink "the silver bullet, diamond-in-the-rough technology" that will change the way drug development data is handled. And a press release announcing IBM's partnership with BioQuebec called IBM "the most advanced computer company in the world in terms of bioinformatics."
Not surprisingly, that sort of talk from a life sciences latecomer has raised the hackles of many veterans of the genomics computing and data integration businesses. In this small, tight-knit industry, IBM's bald ambition is ruffling feathers.
DiscoveryLink's Garlic Breath
DiscoveryLink's earliest incarnation was in the mid-1990s as the pet project of computer scientist Laura Haas at IBM's Almaden research facility. Known internally as Garlic, the project aimed to enable the integration of heterogeneous data sources in a single, cross-source query. Almaden Web pages dating from 1995 expound on the technology's possibilities for helping interior decorators store information on wallpapers, kitchen cabinets, appliances, and floor tiles or for enabling hospitals to store lab reports, MRI scans, and EKGs in one place.
Jill Mesirov, CIO of the Whitehead Institute Center for Genome Research, who did a stint as IBM's manager of bioinformatics and computational biology from 1995 to 1997, recalls that it was clear at the time that Garlic held promise as a drug discovery tool.
It was around 1995 when IBM entered into a hush-hush deal with Merck to codevelop Garlic as a pharma industry tool. Independent IT consultant Arthur Thomas of Proteus Associates, who says he has been involved in a number of briefings that IBM has given to pharmas, considers Garlic "one of the most innovative things that's been done in this area."
But, observers say, IBM seemed to lose interest in the life science market. Merck fell out of the picture and Garlic didn't resurface until 1999, when IBM made a $2 million investment in bioinformatics firm NetGenics and rekindled the Garlic product development in partnership. Mesirov counts the current campaign as the third time IBM has made a big push in life sciences.
Pushy PR
This time around, some in the industry are accusing the computer giant of pushy tactics and of overstating DiscoveryLink's capabilities. They suggest that IBM is buying its way into the market.
Since January 2000, IBM has made a series of equity investments (see timeline) binding its beneficiaries to install the DiscoveryLink-enhanced version 8.0 of the DB2 database when it is delivered. In addition, IBM's minority stakes in LabBook and NetGenics are designed to expand its reach into the sector and elicit tools optimized to run on IBM hardware.
Jeff Augen, director of strategy for IBM's life sciences group, says the company's goal is not to compete with bioinformatics vendors or even Oracle, but to leverage DiscoveryLink to win service contracts. "We believe there are services required. The more important part of this is the infrastructure required," he says.
Thomas observes: "More than half of IBM's revenues come from services. They made a billion-and-a-half dollar deal with Aventis to build a research infrastructure. They're interested in selling services."
Bristling Bioinformaticians
Still, other data integration technology vendors bristle at IBM's approach, which seems antithetical to the open and collaborative ways that computing vendors such as Compaq and Sun do business.
For instance, Bill Blake, vice president for high-performance technical computing at Compaq, says his strategy is: "We'll do the best job on hardware, and there are better companies at building the additional layer."
Blake says that Compaq, which has partnerships with Oracle, InforMax, and Lion Bioscience, prefers to allow best-of-breed third parties to come in to provide middleware solutions.
Sun, which is in fourth place in life science market share after Compaq, IBM, and HP (according to IDC data), is known among genomics companies for having established collaborative efforts such as its Informatics Advisory Council and the industry-wide I3C, which aims to develop an open-source platform for integrating life sciences data and tools.
When IBM joined the I3C recently, its bullying approach didn't sit well with other members. "They pissed off a lot of people by saying they want to be part of the I3C but saying, 'We don't code, we just provide the plumbing.' It was really ridiculous," says one member, who added that Big Blue reps went out of their way to make amends at a subsequent meeting.
Friedrich von Bohlen, CEO of Lion Bioscience, whose SRS data integration technology is widely used throughout the industry, notes that not one of his 50 customers has ever asked to have SRS optimized to run on IBM hardware or database products. He asks rhetorically: "Why does IBM need to use equity to get customers? Because in a free market no one would choose their solution." Von Bohlen says he is skeptical about whether IBM's approach will convince "the highly scientific community [that] wants the best solution support, not mere technology."
InforMax CSO Steve Lincoln says DiscoveryLink is "one of several interesting technologies for data integration, but the important thing about integration is that it's a piece of enabling technology. Just because you can build a data warehouse environment doesn't mean you've told me what a toxicologist is going to do with gene expression data." Lincoln says that for that reason, InforMax is focusing more on "understanding why you would do this and what it means."
Nevertheless, the IBM threat exists for data integration tools providers: Big Blue has the resources to provide the applications, hardware, and services, and steamroll over smaller providers of integration technology and their partners. Says Arthur Thomas, "There may be an element of IBM throwing its weight around, but at the end of the day you would be foolish to underestimate them. They have the resources, and any technology that's required they can either build or acquire from the outside."
Proof and Perception
To be sure, the specter of going head-to-head with the limitless resources of IBM would strike fear in any small-by-comparison bioinformatics company. The reactions of small vendors could be just sour grapes. But to anyone who has invested significant time and money struggling with the genomic data integration problem, IBM's self-assuredness is understandably hard to swallow.
NetGenics VP Beth Sump-Kleinhenz says that while her team "recognized [DiscoveryLink] early on as a very nice enabling technology," she will be surprised if a single integration solution for drug discovery ever emerges. "There's not one key solution, there can't be one," she says. "Integration is a tough problem."
Meanwhile, IBM's trouble getting acceptance from the industry seems to have more to do with how its perception of itself differs from what outsiders see than with its technology. Big Blue doesn't see itself as the newcomer. Scientists at the company's Almaden research facility were working on this technology before some bioinformatics companies existed. Says IBM's Augen, "We spent many years developing the optimization built into DiscoveryLink. We're very proud of that technology."
And while Augen claims that DiscoveryLink has capabilities built into it that those small companies can't develop, he sees them as features that would enhance Lion's capabilities.
Sump-Kleinhenz, who says she hasn't seen anything else that "purports to deal at [the same] level of complexity" as DiscoveryLink, offers this example for how a pharma scientist might use the tool: "Given that I am interested in proteins from gene family X, and specifically in those family members that have a structure Y, please provide me a list of chemicals containing a specific R-group (Z) that inhibit these proteins by 10-fold at a concentration of less than one micromolar, and which are present in our combinatorial chemistry library in a quantity of more than 10 mg."
To do such a search without DiscoveryLink, she says, would require separate searches, including, for example, keyword searches on GenBank, SwissProt, and any internal databases; a motif search; a chemical substructure search on one or more chemical libraries; a series of searches on high-throughput screening results; and an inventory search of a chemical library.
Sump-Kleinhenz says, "Once the wrappers to the individual databases have been integrated with DiscoveryLink, its job is to deconstruct the query and bring the final result back to you in the form of 'Here are three compounds that meet your criteria.'"
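For readers who want to picture the mechanics, the query Sump-Kleinhenz describes can be sketched in a few lines of Python. This is only an illustration of the general wrapper-and-decompose idea, not IBM's implementation: the Wrapper class, the four toy data sources, every record in them, and the federated_query function are hypothetical and invented for this example. A real federated engine would also decide which parts of a query to push down to each source, which this sketch does not attempt.

```python
from dataclasses import dataclass
from typing import Callable, Iterable

# Everything below is hypothetical: the sources, field names, and records
# are made up to illustrate the idea of wrappers plus query decomposition.


@dataclass(frozen=True)
class Wrapper:
    """Presents one data source through a common 'list of dicts' interface."""
    name: str
    fetch: Callable[[], Iterable[dict]]

    def select(self, predicate: Callable[[dict], bool]) -> list:
        # Run one sub-query against this source only.
        return [row for row in self.fetch() if predicate(row)]


# Four stand-in sources, each of which would be a separate database in practice.
proteins = Wrapper("proteins", lambda: [
    {"protein": "P1", "family": "X", "structure": "Y"},
    {"protein": "P2", "family": "X", "structure": "W"},
    {"protein": "P3", "family": "Q", "structure": "Y"},
])
compounds = Wrapper("compounds", lambda: [
    {"compound": "C1", "r_group": "Z"},
    {"compound": "C2", "r_group": "Z"},
    {"compound": "C3", "r_group": "A"},
])
screening = Wrapper("screening", lambda: [
    {"compound": "C1", "protein": "P1", "fold_inhibition": 12.0, "conc_um": 0.5},
    {"compound": "C2", "protein": "P1", "fold_inhibition": 4.0, "conc_um": 0.5},
    {"compound": "C3", "protein": "P1", "fold_inhibition": 20.0, "conc_um": 0.2},
    {"compound": "C2", "protein": "P3", "fold_inhibition": 15.0, "conc_um": 2.0},
])
inventory = Wrapper("inventory", lambda: [
    {"compound": "C1", "quantity_mg": 25.0},
    {"compound": "C2", "quantity_mg": 50.0},
    {"compound": "C3", "quantity_mg": 5.0},
])


def federated_query(family, structure, r_group,
                    min_fold=10.0, max_conc_um=1.0, min_mg=10.0):
    """Split one question into per-source sub-queries, then combine the results."""
    targets = {r["protein"] for r in proteins.select(
        lambda r: r["family"] == family and r["structure"] == structure)}
    candidates = {r["compound"] for r in compounds.select(
        lambda r: r["r_group"] == r_group)}
    in_stock = {r["compound"] for r in inventory.select(
        lambda r: r["quantity_mg"] > min_mg)}
    hits = screening.select(
        lambda r: r["protein"] in targets
        and r["compound"] in candidates
        and r["compound"] in in_stock
        and r["fold_inhibition"] >= min_fold
        and r["conc_um"] < max_conc_um)
    return sorted({r["compound"] for r in hits})


print(federated_query("X", "Y", "Z"))
```

The scientist asks one question; the function issues four narrower ones and intersects the answers, which is the step that would otherwise be done by hand across separate search tools.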
Sure sounds promising. But as IBM's Augen acknowledges, "I guess now the onus is on us to prove it."
1995
First mention of Garlic on IBM website refers to data integration applications for kitchen design and advertising agencies
IBM begins working with Merck to develop Garlic for pharma applications
1999
IBM invests $2 million in NetGenics, begins collaborating on DiscoveryLink development
2000
January: IBM makes $10 million equity investment in MDS Proteomics; makes MDS's future use of DiscoveryLink contractual
August: IBM announces that it has $100 million to invest in life sciences partners over the next 30 months.
September: Incyte says it will embed DiscoveryLink into its Genomics Knowledge Platform product
November: IBM takes minority stake in Structural Bioinformatics; in focus group gathered at industry conference, IBM asks industry players what it will take for IBM to break into this market
December: IBM says it will build a 7.5-teraflop computer cluster for NuTec Sciences; also says it may double the $100 million available for life sciences partner investments
2001
March: IBM establishes Global Life Sciences Consulting and Solutions unit
April 11: IBM takes equity stake in LabBook, LabBook to build front end to DiscoveryLink
April 24: IBM buys Informix for $1 billion, says it will jointly develop and market information management systems
April 26: NetGenics says it will integrate DiscoveryLink into its DiscoveryCenter platform and will be a DiscoveryLink distributor
May 3: IBM partners with BioQuebec and says its Net Generation Life Sciences division has "several hundred million to invest in firms operating life science and IT"
May 8: DiscoveryLink is not yet commercially available, nor are products from NetGenics, Incyte, or MDS Proteomics that promise to integrate DiscoveryLink