SRA International may be running one of the most successful bioinformatics businesses you’ve never heard of.
The tiny bioinformatics subunit of government IT contractor SRA employs only 30 of the company’s 2,500 employees, but holds its own when it comes to revenues. The three-year-old group generated $4 million of the company’s $361 million in FY 2002 revenues — mostly in the form of contract services for NIH — and is confident that it can grow that business over the course of 2003.
“We feel like we’ve been a bit of a stealth company,” said John Greene, director of bioinformatics for SRA. “We’re getting a reputation on the NIH campus as the contractor to call when you want to do bioinformatics, but we don’t focus that heavily on the commercial market.”
Viewing bioinformatics as the quickest way to grow its Health Systems business unit, SRA began staffing the group when it won a contract to build the NCI’s intramural microarray database four years ago. The storage and analysis system, developed in close collaboration with the NCI and the NIH Center for Information Technology (CIT), is SRA’s “flagship” bioinformatics project, Greene said, and will likely serve as the primary vehicle for the growth of SRA’s bioinformatics business over the remainder of the year. The company has already replicated the database at three other sites (the Centers for Disease Control, the Netherlands Cancer Institute, and the Genome Institute of Singapore) and is in discussions with a number of other organizations for similar installations, Greene said.
SRA has also won contracts to support bioinformatics groups at several other NIH institutes, most recently at the National Institute of Neurological Disorders and Stroke, and is expanding its work with the CDC to integrate an epidemiology database with the microarray system for researchers studying chronic fatigue syndrome. In addition, SRA is working with the CIT on parallelizing algorithms for high-performance computing, and has emerging efforts in proteomics and biological text mining.
SRA’s primary role is that of a government contractor, and the bioinformatics group doesn’t plan to deviate far from that pattern. However, Greene noted, the group certainly wouldn’t turn down any possible bioinformatics partnerships from the commercial sector as it seeks to expand the business.
Mad about mAdb
The growth of SRA’s bioinformatics group is built upon the momentum of the NIH microarray database project, which has “become far larger than anything we ever imagined,” Greene said. The mAdb database, whose moniker Greene acknowledged is “not a real fancy acronym” for “microarray database,” now supports over 835 NIH intramural scientists and their collaborators worldwide, and stores data on more than 23,500 arrays.
Greene is one of six SRA staffers dedicated to the project at NIH, where John Powell, head of the Bioinformatics and Molecular Analysis Section of NIH, is the government project leader.
The two-tier, web-based system is built on a Sybase database and runs on a Sun enterprise server. Open source software was used wherever possible to keep costs down, Greene said, and the R statistical package serves as the foundation for an analytical toolkit that includes the usual microarray analysis suspects: hierarchical clustering, k-means clustering, self-organizing maps, principal components analysis, multi-dimensional scaling, scatter plots, and statistics including t-tests, the Wilcoxon rank-sum test, ANOVA, and the Kruskal-Wallis test.
The database primarily supports NCI’s two microarray printing centers — one at the Advanced Technology Center in Gaithersburg, and one at NCI Frederick. The system has recently been modified to handle Affymetrix arrays as well, Greene said, noting that it already contains data from approximately 1,500 Affy arrays.
MAdb captures array data after the images are analyzed — NCI uses Axon’s GenePix system for this step, but Greene noted that the database works with other image analysis platforms as well. The database stores a jpeg file of the image, which is broken down into individual spots “so you have the option to see the original spot” for quality control purposes, Greene said.
Greene said when the large-scale database was originally designed, “we were having some problems with input/output bottlenecks because people might be simultaneously uploading into the system while other people are trying to pull data out for analysis.” The developers addressed this problem by creating a system that filters the data and stores it as a separate set of flat files. The analysis tools directly access this flat-file data, eliminating the I/O bottleneck.
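The flat-file approach Greene describes can be illustrated with a short sketch. This is not mAdb's code: the article does not describe the schema, the filter criteria, or the file format, so the `spots` table, the `min_intensity` cutoff, the tab-delimited layout, and every function name below are hypothetical. SQLite stands in for Sybase only so that the example is self-contained.

```python
import csv
import sqlite3
import tempfile
from pathlib import Path


def export_filtered_spots(conn, array_id, out_dir, min_intensity=100.0):
    """Filter one array's spots in the database and write them to a flat file.

    The table layout and the intensity cutoff are assumptions made for
    illustration; they are not taken from mAdb.
    """
    rows = conn.execute(
        "SELECT spot_id, intensity FROM spots "
        "WHERE array_id = ? AND intensity >= ? ORDER BY spot_id",
        (array_id, min_intensity),
    ).fetchall()
    path = Path(out_dir) / f"array_{array_id}.tsv"
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t")
        writer.writerow(["spot_id", "intensity"])
        writer.writerows(rows)
    return path


def load_flat_file(path):
    """Read the filtered data for analysis without touching the database."""
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        return {row["spot_id"]: float(row["intensity"]) for row in reader}


def demo():
    # In-memory database with a few made-up spots across two arrays.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE spots (array_id INTEGER, spot_id TEXT, intensity REAL)"
    )
    conn.executemany(
        "INSERT INTO spots VALUES (?, ?, ?)",
        [(1, "s1", 50.0), (1, "s2", 250.0), (1, "s3", 900.0), (2, "s1", 400.0)],
    )
    with tempfile.TemporaryDirectory() as tmp:
        flat_path = export_filtered_spots(conn, array_id=1, out_dir=tmp)
        # From here on, analysis code reads only the flat file.
        return load_flat_file(flat_path)


if __name__ == "__main__":
    print(demo())
```

The point of the design is that the export step queries the database once per array, while repeated analysis runs read only the resulting file, so uploads and analyses no longer compete for the same database connection.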
The system also provides a feature report for every spot on each array using data from GeneCards, dbEST, LocusLink, BioCarta, KEGG, and GO. Greene said the mAdb development team held off on making the system MIAME compliant “just to let the standard settle a bit,” but expects to make it fully compliant by the end of the year.
Life Outside NIH
MAdb “was never intended to be replicated outside of NIH,” said Greene, “but people started seeing it and wanting it.” While most of the source code for the database is already open source, and the remainder will soon be made publicly available through the NIH, Greene said that the system is too complex for most research groups to simply download the code and install the database themselves. As the sole external contractor working on mAdb, SRA negotiated with NCI for the redistribution rights to the system so it could provide it to outside groups interested in running their own version.
While conceding that the system may not be “quite as slick as some of its commercial counterparts,” Greene noted that it costs substantially less than other enterprise-scale systems like Rosetta Resolver. SRA charges only for the time and services required to implement the system.
Recognizing the potential value of the mAdb system to commercial users, SRA took steps just over a year ago to launch a spin-off bioinformatics company called Tapestry. However, the effort fell victim to the recent venture funding drought, and has been shelved for the time being. “If venture money starts to flow again, I think management would start to look at that again,” Greene said, but in the meantime, SRA is sticking to its tried-and-true contractor model — the profit margins may be slimmer than in the commercial sector, but venture capital funding isn’t required to jump-start a product development effort. “VCs don’t tend to want to invest in services businesses,” Greene noted.
Greene added that SRA stands to benefit from the Competitive Sourcing, or A-76, program, implemented by President Bush in 2002, which requires that government agencies obtain commercially available goods and services from the private sector whenever possible. While the push to outsource “is making a lot of people at NIH very nervous,” Greene said, “from a contractor’s point of view, we welcome this. It means we will probably have to compete with government people for some of these positions if they do decide to outsource them, but additional markets will open up.”
The company is also mulling a more concerted marketing push into the commercial sector, according to Greene. “We suddenly realized that we’ve got 25-30 people, we’re doing about $4 million in revenues, the business is growing even this year, and very few people know about us outside the NIH campus. If more people knew about us, we might grow that business substantially more,” he said.