GAITHERSBURG, Md.--To take advantage of market demand for its proprietary object protocol model (OPM)-based bioinformatics system, Gene Logic here expanded the mission of its Data Logic division in October to intensify efforts to sell the company's suite of data management and integration tools directly to pharmaceutical companies.
Previously, the bioinformatics system was less a stand-alone product than the platform upon which Gene Logic's proprietary databases were built and with which data produced by its other, higher revenue-generating genomics technologies were made useful. Gene Logic's core products are a differential expression analysis tool, three proprietary gene expression databases, and a screening technology for drug lead identification.
To date, the two-year-old company has licensed those products in deals with one agriculture-industry customer and nine major pharmaceutical companies, including a pharmacogenomics collaboration with Rhone-Poulenc Rorer Gencell that was announced two weeks ago.
Then, last May, SmithKline Beecham licensed the OPM system and software tools on its own and retained Gene Logic to use the technology to develop a series of customized databases and servers to integrate disparate public and proprietary data sources into SmithKline's datamining process. The value of that deal wasn't disclosed, but Michael Brennan, Gene Logic's president and CEO, said he expects the expanded Data Logic division to become a $15 million annual business within three years.
Brennan met with BioInform late last month to talk about the company's growth and his decision to increase bioinformatics marketing efforts. In part one of the interview, Brennan describes the OPM platform and his view of opportunities in the bioinformatics market. The conclusion of the interview, in which Brennan discusses Gene Logic's financing journey and the steps leading up to its initial public offering a year ago, will appear in the next issue, January 4.
BioInform: What's the history of the Data Logic division?
Brennan: We started that division to capitalize on the OPM set of products developed by Victor Markowitz and his team at Lawrence Berkeley National Laboratory. Initially we were looking for a way to make the data content that we generated using our genomic technologies immediately useful in the context of all the other information that a pharma company uses in the drug discovery and development process.
The major hurdle the pharma industry faces is not so much analysis of information; there are a lot of tools available to analyze genomic data, to compare sequence homologies, and do things like motif searching. The major issues are hardcore data management and integration. The lack of a data management infrastructure, rather than the lack of analytical tools, is what's held the industry back from making optimum use of the data that are available to it.
It's a very difficult problem. The kinds of queries you want to ask span databases that have different structures that may be object-oriented or relational or flat-file. There's a huge number of public genomic databases--slightly over 300 at our last count--in various formats. They're of different vintages, on different platforms, with data represented in different ways. That makes it extraordinarily difficult to construct queries that span all of these different data sources.
BioInform: How does OPM address this problem?
Brennan: OPM has been widely used in the academic community. The Genome DataBase, the three-dimensional protein database in Brookhaven, and the entire infrastructure for the German Genome Project were built using OPM. If there is any standard at all for genomics information management it's de facto OPM because those are the biggest databases that have been built.
But what had restricted OPM's adoption by the pharma industry was the fact that it was academic software. There wasn't guaranteed commercial support with the ongoing versioning, maintenance, and updates that commercial software necessitates. We decided to build a commercial enterprise around OPM. We acquired OPM and all of the rights to the existing product and we hired Victor Markowitz and his entire development team and installed them in an office in Berkeley with the mandate to develop a commercial version of OPM. That involved rewriting some of the modules and developing the documentation and the support processes that are necessary to launch a commercial version. OPM 7, the commercial version, is completed now.
BioInform: How does it work?
Brennan: OPM allows you to build new databases for complex biological information quickly and flexibly. It's not a replacement for the underlying database management system, such as Oracle or Sybase or the others. It's a high-level language that gives you a view into the structure of the database you're designing so you can design it representing objects in a consistent and very flexible way. We used it to develop our own databases that we supply to customers.
The OPM database-building toolkit enables you to design from scratch new databases of complex bioinformation. Query tools allow you to interrogate the data captured within the database. OPM allows you to build complex queries on the fly, and you don't need to be wonderfully skillful to do it. It's a click, drag, and drop process. That means you can design a query, run it, see if you're getting the information that you require, and then, if you're not, go back and, typically within hours, redesign it yourself.
BioInform: Does it eliminate the need for a programmer?
Brennan: It eliminates the need for people to go and write hardwired code in order to give you access to the data within the databases. That was one of the major features that attracted SmithKline Beecham. They do hundreds of queries, every one requiring months of programming work. This completely eliminates that issue.
BioInform: How does OPM let you interrogate databases that aren't OPM-based?
Brennan: There are a lot of legacy databases that weren't built using OPM--they were built in relational or flat-file formats. Specialized toolkits allow you to retrofit OPM views on top of existing databases to bring those legacy databases into your datamining environment. Then, using OPM multidatabase query toolkits, you can apply the querying capability across all of the databases within the framework of your data management system. Now you're asking the question not just of one database built in OPM, but of multiple databases.
The importance of being able to put these different OPM views on a database is incredible. The kinds of queries you're able to ask depend on the view of the data that you're able to take. To get out of a database the sort of answers that you require, you need to design the way you look at the information. It's like having a three-dimensional object and you're only able to look at it from one side at once.
For instance, I'm looking at a clock and I can see from the front side that it's 10:25. If I want to know how the electric socket fits into the plug at the back, I need to be able to look at that thing from the other side. OPM allows you to do that, and it's the only system that does.
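The idea Brennan describes, placing a common view over legacy relational and flat-file sources so that one query can span all of them, can be illustrated with a minimal Python sketch. This is not OPM code and does not reflect OPM's actual interfaces; the class names (RelationalView, FlatFileView), the query_all function, and the sample gene records are all hypothetical and exist only for illustration.

```python
import sqlite3
from typing import Callable, Dict, Iterable, Iterator, List

Record = Dict[str, str]


class RelationalView:
    """Hypothetical view that presents rows of a SQL table as plain records."""

    def __init__(self, conn: sqlite3.Connection, table: str, columns: List[str]):
        self.conn = conn
        self.table = table
        self.columns = columns

    def records(self) -> Iterator[Record]:
        sql = f"SELECT {', '.join(self.columns)} FROM {self.table}"
        for row in self.conn.execute(sql):
            yield dict(zip(self.columns, row))


class FlatFileView:
    """Hypothetical view that presents delimited text lines as the same kind of record."""

    def __init__(self, lines: Iterable[str], columns: List[str], sep: str = "\t"):
        self.lines = list(lines)
        self.columns = columns
        self.sep = sep

    def records(self) -> Iterator[Record]:
        for line in self.lines:
            line = line.strip()
            if line:
                yield dict(zip(self.columns, line.split(self.sep)))


def query_all(views, predicate: Callable[[Record], bool]) -> List[Record]:
    """Ask one question of every source, regardless of its storage format."""
    return [rec for view in views for rec in view.records() if predicate(rec)]


def demo() -> List[Record]:
    # A "legacy" relational source (invented sample data).
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE genes (gene_id TEXT, organism TEXT)")
    conn.executemany(
        "INSERT INTO genes VALUES (?, ?)", [("G1", "human"), ("G2", "mouse")]
    )
    # A "legacy" flat-file source (invented sample data).
    flat_lines = ["G3\thuman", "G4\tyeast"]

    columns = ["gene_id", "organism"]
    views = [
        RelationalView(conn, "genes", columns),
        FlatFileView(flat_lines, columns),
    ]
    return query_all(views, lambda rec: rec["organism"] == "human")


if __name__ == "__main__":
    for hit in demo():
        print(hit)
```

In this sketch the question "which genes are human?" is written once and answered from both sources, because each view hides how its underlying data are stored.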
BioInform: Will the Data Logic division now market these capabilities to customers?
Brennan: The objective with the Data Logic division is, in addition to using the toolkits and the expertise to build the databases that we sell, to make this capability available independent of any data content.
SmithKline was the first one that signed up. They've got a wide range of internal proprietary data, such as Human Genome Sciences' sequence information, as well as other data that they're generating themselves. They also have access to all of the public databases.
Our project there is threefold. First, we built some new databases for them to capture specific information. They also wanted to be able to build a data management system that puts OPM views on top of the public systems that they want to integrate into their process. Third, they want to make use of the OPM query capability in order to generate a range of sophisticated queries.
I can give you one example of a query. SmithKline wanted to incorporate Blast into the actual query mechanism. They wanted to run Blast as an integral part of the question asked against all of the data sources available to them--public and private sequence data, flat-file databases, and so on--and to have the query results recaptured in a database that gets continually refreshed as new queries are asked. So we built that OPM Blast server as part of our program with SmithKline.
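The pattern in that example, running a sequence search as part of the query and recapturing the results in a store that is refreshed as new queries are asked, might look roughly like the Python sketch below. It makes several assumptions: toy_search is a deliberately crude stand-in that counts shared three-letter subsequences and is not Blast, and ResultStore, its ask method, and the sample sequences are hypothetical names that do not describe the server Gene Logic built.

```python
from typing import Callable, Dict, List, Tuple

Hit = Tuple[str, int]  # (source sequence name, similarity score)


def toy_search(query: str, sources: Dict[str, str]) -> List[Hit]:
    """Stand-in for a real sequence search. Counts shared 3-mers; NOT Blast."""
    kmers = {query[i:i + 3] for i in range(len(query) - 2)}
    hits = []
    for name, seq in sources.items():
        score = sum(1 for i in range(len(seq) - 2) if seq[i:i + 3] in kmers)
        if score:
            hits.append((name, score))
    # Best score first; ties broken by name for a stable order.
    return sorted(hits, key=lambda h: (-h[1], h[0]))


class ResultStore:
    """Hypothetical store that recaptures search results each time a query is asked."""

    def __init__(self, search: Callable[[str, Dict[str, str]], List[Hit]],
                 sources: Dict[str, str]):
        self.search = search
        self.sources = sources
        self.results: Dict[str, List[Hit]] = {}

    def ask(self, query: str) -> List[Hit]:
        hits = self.search(query, self.sources)
        self.results[query] = hits  # overwrite any stale entry for this query
        return hits


if __name__ == "__main__":
    store = ResultStore(toy_search, {"a": "ACGTACGT", "b": "TTTTTT"})
    print(store.ask("ACGT"))
    store.sources["c"] = "CGTA"  # a new sequence arrives in a source
    print(store.ask("ACGT"))     # asking again refreshes the stored result
```

Re-asking a query after the sources change replaces the stored answer, which is one simple way to keep a results database current.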
BioInform: How big is the potential for business in this area?
Brennan: The bioinformatics market means different things to different people. If you look purely at providing analytical tools for bioinformatics, I think that's a small business. You can't command a high price for packages of analytical tools. There are lots of them available, a lot for free. I'm not saying there's no market for building clever analytical tools, but it's not a market that I think is a very high-value-generating business.
Being able to build the kind of integrated data management systems that we're talking about is a significantly bigger business opportunity, but I still don't think, today, that it's colossal. Where it's critical for us is in making our data content that much more valuable.
I know how much money you can get out of a major project with a major company in a year. Even if you sign up the top 30 pharma companies, that's not big enough to say that this is, in and of itself, an entire industry. I think it's going to be really tough to make a lot of money out of bioinformatics as a self-standing business.
We think that within the next two to three years the Data Logic business is going to be about a $15 million business annually. That's about what we get for selling databases to two major customers. We don't think we can build it into a $100 million or $200 million or $300 million revenue stream. The benefit for us is that the margins on that $15 million are huge, because the products are already built.