Cira Discovery Sciences claims that its data-mining technology can bend and stretch to handle biosequence analysis, cheminformatics, and gene expression and SNP analysis. But rather than showcase its proprietary approach in one of these well-established analytical realms, Cira, a Philadelphia-based startup, has chosen the burgeoning, yet controversial, field of proteomics-based diagnostics for one of its first proof-of-principle collaborations. This week, the company announced a collaboration with the Wistar Institute to develop pattern-based diagnostics for lung cancer.
Cira, which raised $100,000 in seed funding from Pennsylvania life science funding organization BioAdvance in mid-February, plans to use the partnership to “establish with real scientific credibility that our integrated approach is capable of developing diagnostic information,” said Wade Rogers, the company’s CEO.
The field of proteomics-based diagnostics, which is based on patterns of protein expression rather than on individual biomarkers, dovetails nicely with Cira’s approach, Rogers said, because that approach is designed to look at patterns in the data rather than at individual elements. The company’s technology assumes that the information in high-dimensionality data sets “is carried primarily in the interactions of the dimensions, not the individual dimensions themselves,” according to Rogers.
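Rogers’ point about interactions can be illustrated with a toy case. The sketch below is not Cira’s method or data: it assumes a hypothetical data set of two binary measurements whose label is their exclusive-or, and the names `samples`, `labels`, and `best_accuracy` are invented for the illustration. Each measurement alone predicts the label no better than a coin flip, while the joint pattern predicts it perfectly.

```python
from collections import Counter, defaultdict

# Hypothetical toy data (an assumption for illustration, not Cira's data):
# two binary measurements per sample, with the label equal to their XOR.
samples = [(0, 0), (0, 1), (1, 0), (1, 1)] * 25
labels = [a ^ b for a, b in samples]

def best_accuracy(key):
    """Best accuracy reachable when the prediction may depend only on key(sample).

    Samples are grouped by key(sample), and the majority label of each
    group is used as the prediction for that group.
    """
    groups = defaultdict(Counter)
    for sample, label in zip(samples, labels):
        groups[key(sample)][label] += 1
    correct = sum(max(counts.values()) for counts in groups.values())
    return correct / len(samples)

acc_first = best_accuracy(lambda s: s[0])   # first measurement only
acc_second = best_accuracy(lambda s: s[1])  # second measurement only
acc_joint = best_accuracy(lambda s: s)      # both measurements together

print(acc_first, acc_second, acc_joint)
```

In this constructed example all of the predictive information sits in the combination of the two dimensions, which is the situation Rogers describes for high-dimensionality data.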
Cira joins a growing number of groups that are developing computational approaches to screen patients for cancer using mass spectrometry data. Since the publication of an initial ovarian cancer study by Correlogic, the FDA, and the NCI in the Lancet in 2002 [BioInform 02-25-02], the field of proteomics-based diagnostics has blossomed, based on the hope that a low-cost blood-based cancer screening test may be within reach. Lately, however, early approaches to classifying proteomics data sets have come under fire from critics who question the statistical rigor of current methods, the reproducibility of the data, and whether pattern-based diagnostics will ever gain acceptance without knowledge of the underlying proteins. [See the 02-13-04 issue of BioInform’s sister publication ProteoMonitor for further details on these issues.]
Rogers said that Cira is well aware of the growing controversy in the field, but indicated that the company’s collaboration with the Wistar Institute will differ from prior proteomics-based diagnostics approaches in several ways. First of all, he said, the company hopes to circumvent the questions about sample collection and experimental methods that have plagued the publicly available FDA/NCI data set by instead using new data from David Speicher’s proteomics lab at the Wistar Institute. “We’ll have control over the acquisition of the data, and direct, first-hand knowledge of how the data has been acquired,” Rogers said.
In addition, Rogers said, the company uses signal-processing techniques “that concentrate the information contained in a mass spectrum onto a smaller number of significant components,” a step he said removes noise from the raw spectra without losing significant information.
Finally, he added, Cira’s core data-mining technology differs drastically from Correlogic’s genetic algorithm and other pattern-discovery approaches, which rely on heuristics or a priori assumptions to reduce the dimensionality of the pattern discovery problem and therefore overlook some of the patterns in large data sets. Cira’s method is “exhaustive and complete,” Rogers said — an approach that would normally be computationally intractable because the number of calculations required to solve the problem grows exponentially with the size of the input. But Rogers said that Cira has found a way around this limitation via a “polynomial solution” that can find all the patterns in the data “in a reasonable amount of computing time.”
Cira has tested its pattern discovery approach on other types of biological information besides proteomics data. In a white paper available through its website, Cira researchers discuss their analysis of the Gabriel et al. SNP data set used to prove the feasibility of the International Haplotype Map project in 2002 [BioInform 05-27-02]. The company said its approach was able to identify more than 115,000 patterns at a relatively low support level in just 21 seconds on a 2 GHz Pentium 4 with 1 GB of RAM.
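To make the terms concrete, the sketch below shows a naive exhaustive search for patterns at a given support level. It is an illustration only and not Cira’s algorithm, which the company has not described here. It assumes “support” carries its usual data-mining meaning, the number of records in which a pattern occurs, and the `records` data and the `frequent_patterns` function are hypothetical.

```python
from itertools import combinations

# Hypothetical toy records (an assumption for illustration): each set lists
# the features observed in one sample.
records = [
    {"A", "B", "C"},
    {"A", "B"},
    {"A", "C"},
    {"B", "C"},
    {"A", "B", "C", "D"},
]

def frequent_patterns(records, min_support):
    """Brute-force search: test every non-empty combination of features.

    Returns {pattern: count} for each combination that appears in at
    least min_support records.
    """
    features = sorted(set().union(*records))
    found = {}
    for size in range(1, len(features) + 1):
        for combo in combinations(features, size):
            # Support = number of records containing every feature in combo.
            count = sum(1 for record in records if set(combo) <= record)
            if count >= min_support:
                found[combo] = count
    return found

patterns = frequent_patterns(records, min_support=3)
for pattern, count in patterns.items():
    print(pattern, count)
```

With n distinct features this search tests 2^n - 1 candidate combinations, which is the exponential growth that makes a naive exhaustive approach intractable on large data sets. Lowering `min_support` admits more patterns, which is why a large pattern count at a low support level is the harder case.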
The company is also working with its first industry collaborator — Infinity Pharmaceuticals — on a cheminformatics project, Rogers said. The partnership was initiated by Dennis Underwood, who chairs Cira’s scientific advisory board and is vice president of computational sciences and discovery informatics at Infinity.
Underwood, Rogers, and Cira co-founder Allan Moser worked together as bioinformatics researchers at DuPont Pharmaceuticals in Wilmington, Del. When Bristol-Myers Squibb bought DuPont Pharmaceuticals in 2001 and later decided to close down the Wilmington site, Rogers said, “I was offered a severance package that I couldn’t refuse, and it was the thing that gave me the opportunity to do something that I’ve been wanting to do for some time.” Conceding that there was a good amount of “stark terror” involved in the decision to launch a startup bioinformatics company in today’s economic climate, Rogers said he still believes that “the space we work in is underserved, and there is a big opportunity here to make an impact on healthcare.”
Beyond its seed funding from BioAdvance, Cira receives support from Innovation Philadelphia, an organization that helps technology companies write federal grant applications. The company was also awarded a scholarship that provides a portion of its office space in the University City Science Center Port, a technology incubator located near several Philadelphia research organizations and universities, including the Wistar Institute.
Rogers said that the company’s main short-term goal is to prove its technology through projects like the Wistar proteomics collaboration before seeking additional funding. “I think investors for the most part are interested in proven technology, so we intend to be at that stage by the end of the year,” he said.