Nat Goodman hosts the bioinformatics version of ABC’s TV game show
Before you spend a lot of time racking your brain for the answers, here they are: The $100 question is a no-brainer; the answer is D. For $32,000, the answer is most likely A: none of the public bioinformatics companies is profitable, so the most profitable one must be some small, private company that none of us is aware of. For $1 million, your guess is as good as mine. If you know, drop me a line and we’ll talk term sheets.
My point is simple: It doesn’t take a genius to recognize that bioinformatics is essential for doing anything useful with the piles of data produced by big biology. But no one has yet figured out how to convert this potent technical value into substantial commercial value. The really hard question, the one you have to answer correctly to become a millionaire, is how to devise a business model that can translate the technical necessity for bioinformatics into sustained profit.
Rules of Play
Bioinformatics is the scientific and engineering discipline concerned with the development and use of computational methods for biological research. Since bioinformatics is hot today — the dotcom du jour — business analysts tend to apply the label somewhat liberally.
I prefer a narrow, pure-play definition: the bioinformatics industry produces goods and services for biological research whose value derives from computational competence. This includes software products, software-related consulting, software-enhanced databases (such as Proteome’s YPD and Compugen’s LEADS), and software-driven discovery. It excludes proprietary data products (such as Celera’s genome databases and Incyte’s EST databases), laboratory-driven discovery, and other products embodying considerable laboratory expertise.
Bioinformatics today is a puny market. I’ve seen analyst reports pegging the market at anywhere from $175 million to $265 million in revenues. My colleagues and I at 3rd Millennium did a bottom-up analysis in which we summed the revenues reported by all the public bioinformatics companies; this is an inexact calculation, because many of the companies are hybrids with revenues from nonbioinformatics sources as well.
After weighing the evidence and tossing in a dash of salt, my best guess is that the current market is about $200 million. To place this in context, all of bioinformatics barely reaches the big toe of Microsoft, whose revenue tops $20 billion.
The pundits and stock pickers want us to believe that bioinformatics is a hot market poised for enormous growth, with projections of $1 billion to $2 billion in revenue by 2005. This implies rather spectacular growth indeed: It will take about 40 percent annual growth to get from the current base of $200 million to $1 billion in 2005, and 60 percent annual growth to reach $2 billion over that time frame. For context again, Microsoft’s growth rate during the PC and NT server explosion was between 25 and 30 percent.
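The growth rates quoted above can be checked with a few lines of Python. This is only a rough sketch: the function name is made up for illustration, and the five-year horizon is an assumption (the column does not state the starting year), chosen because it reproduces the rounded figures in the text.

```python
def required_annual_growth(start, target, years):
    """Constant annual growth rate that takes `start` to `target` in `years` years."""
    return (target / start) ** (1 / years) - 1

# Assumption: a five-year horizon ending in 2005.
base = 200e6  # best-guess current market, in dollars
for target in (1e9, 2e9):
    rate = required_annual_growth(base, target, years=5)
    print(f"${target / 1e9:.0f} billion by 2005 needs {rate:.0%} growth per year")
```

Under that assumption the rates come out near 38 and 58 percent, in line with the "about 40" and "60" quoted above.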
The bullish estimates for bioinformatics growth are based on a belief that big pharmaceutical companies will rapidly assimilate large-scale biotechnologies throughout their R&D processes. The theory is that big pharma wants to mimic the genomics-based discovery companies, such as Millennium and Exelixis. This theory leads inexorably to large increases in bioinformatics spending, since companies like Millennium are thought to spend 15 to 25 percent of their R&D budgets on bioinformatics, in contrast to the sub-1 percent that big pharma spends today.
This belief is seductive to those of us who work in biotech, but it is highly speculative. While I’m certain that big pharma will drive ahead with technology, I’m not convinced this will happen terribly fast. Nor is it obvious how complete the endpoint will or should be.
Will big pharma R&D become 90 percent “techified” by 2005? Or will it just be 10 percent? This has a huge effect on the size of the bioinformatics market. In conversations with R&D executives, the general message is that change is happening much more slowly than the market cheerleaders would have us believe.
The pharmaceutical industry worldwide, including traditional big pharmas and up-and-comers such as Amgen and Genzyme, spends about $45 billion per year on R&D. About one-third of the R&D total ($15 billion) is spent on research, and about $7 billion of this is on areas relevant to bioinformatics, namely, biological discovery research. Pharmaceutical R&D expenditures are growing at about 10 percent per year, leading to an estimate of $11 billion for biological research in 2005.
The heterogeneous biotech industry is another component of the bioinformatics market. It includes small pharmas, companies involved in focused biological research, firms that develop technology (including bioinformatics), and “omics”-based discovery and database companies.
I considered the influence of small pharmas in the pharma discussion earlier, and companies in the second and third categories above are not major users of bioinformatics. The final “omics” category is composed of major bioinformatics users. My best guess is that this segment spends about $2 billion on biological research today. If we project 20 percent growth, this leads to about $5 billion in biological research in 2005.
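The two 2005 projections, for pharma and for the “omics” segment, follow from simple compounding. The sketch below assumes five years of growth (the column does not say how many years it compounds over), and the helper name is hypothetical.

```python
def project(spend_today, annual_growth, years):
    """Compound today's spend forward at a constant annual growth rate."""
    return spend_today * (1 + annual_growth) ** years

YEARS = 5  # assumption: five years of growth to reach 2005

pharma_2005 = project(7e9, 0.10, YEARS)  # biological discovery research in pharma
omics_2005 = project(2e9, 0.20, YEARS)   # "omics" discovery and database companies

print(f"pharma: ${pharma_2005 / 1e9:.1f} billion")
print(f"omics:  ${omics_2005 / 1e9:.1f} billion")
```

With those inputs the results land at roughly $11 billion and $5 billion, the figures used in the text.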
Academia also spends a lot on biological research. The world’s largest funding source for such research, the US National Institutes of Health, has a budget of about $15 billion. This is a lot more than pharma and biotech spend on biology.
Despite the importance of this market segment in practice, I’m going to ignore it for purposes of this analysis: It’s too hard to extract the fraction that falls within our scope, and differences in price sensitivity between industry and academia make it hard to normalize market projections across these segments.
Leading “omics” companies claim to spend 15 to 25 percent of their R&D budgets on bioinformatics. This includes internal and external expenditures on hardware, software, services, and data. The part we care about here is external purchases of software and software-related services and data. The rule of thumb in the computer industry is that about one-third of IT budgets go for externally purchased software and related services; this seems a reasonable guess here, too.
Let’s add it all up. If pharma becomes 50 percent techified by 2005, and if its techie projects spend 20 percent of their budgets on bioinformatics, and if one-third of this is spent on external bioinformatics products, we get $363 million. “Omics” companies add another $333 million to the pot. The grand total, $700 million, is far short of the bullish estimates we’re seeing in the press. We can push the number up to $1 billion by assuming that pharma becomes 100 percent techified by 2005, but only a game show host or stock analyst could be that optimistic.
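For readers who want to try their own assumptions, the sum above can be written as a short Python sketch. The function and parameter names are made up for illustration, and treating “omics” companies as 100 percent techified is my reading of the text rather than something it states. Using an exact one-third gives about $367 million for pharma; the $363 million quoted above corresponds to rounding one-third to 0.33.

```python
def external_bioinformatics_spend(research_budget, techified_share,
                                  bioinformatics_share=0.20, external_share=1 / 3):
    """Research budget -> share on techie projects -> share on bioinformatics
    -> share spent on externally purchased products."""
    return research_budget * techified_share * bioinformatics_share * external_share

PHARMA_RESEARCH_2005 = 11e9  # projected biological research spend, from the text
OMICS_RESEARCH_2005 = 5e9    # projected "omics" research spend, from the text

pharma = external_bioinformatics_spend(PHARMA_RESEARCH_2005, techified_share=0.5)
# Assumption: "omics" companies are treated as fully techified.
omics = external_bioinformatics_spend(OMICS_RESEARCH_2005, techified_share=1.0)
total = pharma + omics

# The optimistic case: pharma becomes 100 percent techified.
optimistic = external_bioinformatics_spend(PHARMA_RESEARCH_2005, 1.0) + omics

for label, value in [("pharma", pharma), ("omics", omics),
                     ("total", total), ("optimistic total", optimistic)]:
    print(f"{label}: ${value / 1e6:.0f} million")
```

Changing `techified_share` or `bioinformatics_share` shows how sensitive the total is to the speed at which pharma adopts the technology.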
Use Your Lifelines
Even if the market gets as big as people hope, there remains the challenge of creating a valuable company. In addition to all the usual startup hurdles (assembling a strong management team, hiring good technical and business people, having a good idea, raising money, executing the idea, hitting the market at the right time, raising more money, finding customers, etc.), there are a number of industry-specific obstacles that have bedeviled most entrepreneurs who’ve tried to play the bioinformatics game.
One significant difficulty is that the market is exceedingly diffuse from a technical standpoint. Imagine a matrix whose columns represent different kinds of biological data — sequences, structures, microarrays, etc. — and whose rows represent different kinds of software — analysis programs, visualization tools, databases, and so forth. Each cell in this matrix is a different technical niche requiring a unique combination of scientific and software skills.
There are probably 50 or more different niches embedded in this one market, fragmenting the presumed $1 billion total market into 50 submarkets of about $20 million each. This means that a big company has to span multiple niches, which is generally a tough goal for technically oriented startups, whose claim to fame is expertise in one particular area.
To beat this problem, a number of companies have developed enterprise integration platforms that provide analytical software, visualization software, user interfaces, and databases across many types of data. A long string of contestants have failed at this, but a few are hanging in there despite large losses. One is InforMax, which reports losses of $11 million on revenues of $17 million. Another is Lion Bioscience with losses of $12 million on revenues of $10 million.
Phone a Friend
A big challenge for enterprise integration companies is that customers usually want the freedom to easily incorporate their favorite programs — often the latest and greatest innovations coming out of academia — while vendors want customers to stick with the software that comes with the platform in order to sustain high pricing.
I was curious to see how InforMax and Lion have been working this key issue. I went to Cambridge Healthtech Institute’s Drug Discovery and Development Deals Database to see what I could learn. I found that both companies are doing a lot of deals to bring proprietary tools and databases onto their platforms. For example, InforMax has signed a deal to deliver AxCell’s protein-protein interaction data via InforMax, while Lion has inked a pact with Celera to serve Celera’s genomic data via Lion. It remains to be seen whether these types of deals will satisfy customers’ yearnings for freedom of choice, but it’s surely a step in the right direction.
Ask the Audience
A problem for vendors of all stripes is the continuous influx of excellent new software, mostly from academia. This steady stream of new competitors puts price pressure on existing vendors and forces them to maintain high levels of R&D spending. This is great for customers, but bad for profits.
We can watch this unfold right under our noses in the microarray niche. Silicon Genetics has worked long and hard to create a strong product and establish a lead over several of its competitors. Now, as I discussed in March, it’s faced with new competition from an academic newcomer called J-Express. And even more significantly, all the microarray software vendors are facing pressure from academic groups who are developing radically new statistical methods for analyzing microarray data.
This brings to mind one central difference between great academic work and a successful commercial product: Good science is reproducible; good products are not.
The current Millionaire mentality is great for customers in the short run, but the excitement at seeing contestants struggle and fail gets old pretty quick. Mature customers need mature vendors who in turn need stable markets to sustain reasonable profits.
Perhaps a consortium of pharmaceutical companies patterned on the highly successful SNP Consortium could solve the problem. The consortium would raise money from its members and pay software companies to create the software products that the industry needs.
But this would require getting the software guys to stop playing Who Wants to Be a Millionaire, and start playing Who’s Satisfied With a Decent Living and Fulfilling Job. It’s going to take a few more failed bioinformatics companies before we see a lot of contestants lining up to play this new game.
ABC’s Who Wants to Be a Millionaire
Pharmaceutical Research and Manufacturers of America (PhRMA)
Biotechnology Industry Organization (BIO)
Cambridge Healthtech Institute’s Drug Discovery & Development Deals Database