Thermo Fisher Scientific last month began shipping its Proteome Discoverer software, a major overhaul of its proteome informatics platform that is built entirely around the concept of workflow technology.
The system, which includes an in-house developed workflow engine as well as a slimmed-down version of the InforSense workflow technology called the InforSense Virtual Machine, or IVM, is the culmination of an original-equipment-manufacturer agreement that the two firms signed in 2006 [BioInform 10-13-06].
For InforSense, the software represents the first third-party product to include an embedded version of its workflow tools, while for Thermo Fisher, the launch signals a “next-generation” software product that “adapts to the evolving nature of proteomics,” Andreas Hühmer, proteomics marketing director at the company, told BioInform.
Hühmer noted that since Thermo launched its widely used Sequest proteomics search engine more than a decade ago, “the field of proteomics has significantly broadened in scope and it covers a much wider array of applications in biology and the clinic.” As a result, he said, “it really has come to the point where … it’s actually impossible for a company to make a turnkey solution for all these evolving fields.”
Hühmer said the company decided several years ago to replace the BioWorks platform, which included Sequest, with “a brand new platform — not just another application, but a platform — that could evolve in the future with the needs of the field.”
At the heart of the new platform is the concept of the workflow engine, which allows users to easily build complex analytical pipelines through a graphical user interface. Proteome Discoverer actually includes two such engines: an in-house developed workflow engine that handles signal processing and protein identification via database searching; and the IVM workflow engine, which can annotate proteins by querying public databases like GenBank, UniProt, and the Gene Ontology.
“We’ve really taken that workflow concept to a level where we now can rapidly adapt to new and emerging workflows in proteomics, and we can do this on the side of the raw data manipulation as well as on the side of data analysis,” Hühmer said.
Another advantage of the workflow approach, Hühmer said, is that Thermo Fisher can very rapidly provide new capabilities for its customers in the form of new workflows, as opposed to the more traditional model of releasing a new version of the entire software package, which can take a year or more to develop and implement.
Hühmer acknowledged that the two-year development timeframe for Proteome Discoverer is lengthy compared to most bioinformatics projects, but noted that the implementation of the workflow concept “is not something you do in a normal development cycle,” and that it “required specific effort.”
The “upside” of that, he said, “is that we now have the capability to rapidly deploy new capabilities that do not take an entire software development cycle. I can literally send a new workflow via e-mail to a customer. It’s just an XML file, and it can be done in both of the workflow engines.”
In addition, Thermo Fisher’s customers can take advantage of the workflow capability, he said. “If I am a collaborator in a lab and I develop a workflow with certain parameters, and then I want my colleague across the campus or across the country to use the very same standard workflow, I’ll send him an XML file through e-mail and he’ll be able to reproduce exactly the same workflow.”
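The portability Hühmer describes rests on the fact that a workflow — its processing nodes, parameters, and connections — can be serialized to a single plain-text XML file. As a purely hypothetical illustration (the element names, node types, and parameters below are invented for this sketch and do not reflect Proteome Discoverer’s actual XML schema), a shared workflow file might be parsed back into a pipeline definition like this:

```python
import xml.etree.ElementTree as ET

# Hypothetical workflow file contents. All names here are illustrative,
# not Proteome Discoverer's real format.
WORKFLOW_XML = """\
<workflow name="basic-id-search">
  <node id="1" type="SpectrumSelector"/>
  <node id="2" type="DatabaseSearch">
    <param name="database" value="uniprot_sprot.fasta"/>
    <param name="precursorTolerance" value="10 ppm"/>
  </node>
  <edge from="1" to="2"/>
</workflow>
"""

def load_workflow(xml_text: str) -> dict:
    """Parse a workflow XML string into its nodes, parameters, and edges."""
    root = ET.fromstring(xml_text)
    nodes = {
        n.get("id"): {
            "type": n.get("type"),
            "params": {p.get("name"): p.get("value") for p in n.findall("param")},
        }
        for n in root.findall("node")
    }
    edges = [(e.get("from"), e.get("to")) for e in root.findall("edge")]
    return {"name": root.get("name"), "nodes": nodes, "edges": edges}

wf = load_workflow(WORKFLOW_XML)
print(wf["name"])                              # basic-id-search
print(wf["nodes"]["2"]["params"]["database"])  # uniprot_sprot.fasta
```

Because the whole pipeline definition lives in one text file, e-mailing it to a colleague who loads it into the same engine reproduces the analysis with identical parameters — the standardization benefit Hühmer points to below.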
Hühmer noted that this capability could help address a persistent challenge in the proteomics field: standardization.
“There are a lot of experiments that are done without really having a standardized workflow,” he said, noting that this practice can result in very poor reproducibility across different labs, even if they use the same experimental procedures. With Proteome Discoverer, however, “we actually can drive standardization from the raw file analysis all the way to biological context using that [workflow] concept.”
Perhaps more importantly, Thermo Fisher expects the software to help it keep pace with a very rapidly changing field. “Proteomics is not a general tool anymore. Proteomics is specifically used to get answers in a variety of fields,” Hühmer said. “One is, of course, biomarker discovery and validation. Another field is the analysis of very specific target proteins in pharma — biotherapeutics and biosimilars. And then there is the whole rapidly developing field of quantitative biology where people really can look at thousands and thousands of proteins, identify them, and quantitate them at the same time.”
All these application areas present “very new challenges because now you’re looking at a lot of data points — millions and millions of data points — and going through a sensible data reduction is not trivial,” he said.
In addition, Hühmer noted that Thermo Fisher has also expanded its line of mass spectrometers in the last several years so that it now offers a “breadth of technology that is a particular challenge when it comes to data analysis.”
In particular, Thermo offers two fragmentation technologies for most of its mass specs: collision-induced dissociation, or CID, and the newer electron transfer dissociation, or ETD, approach. ETD “requires a different way to analyze spectra,” so the company developed a new database search engine for Proteome Discoverer called Z-Core “that specifically takes advantage of the unique ETD capabilities our instruments [have].”
Hühmer said that customers can also add Mascot and other third-party search algorithms to the software platform — another first for the firm. “We worry about the data reduction a lot, and that’s why we made the commitment and the investment to write the software and deliver it to our customers,” he said. “But if the customer decides that Mascot is the best search engine on the planet, we don’t want to disagree with our customer.”
Future releases of Proteome Discoverer will likely “focus more around adding label-free quantification, metabolic labeling, and then really expanding on the capabilities that we have generated with the IVM workflow,” Hühmer said. “One could imagine, for example, linking directly to pathway information.”
Building an OEM Base
For InforSense, Proteome Discoverer represents the first fruits of an OEM strategy that it kicked off two years ago when it first signed the development agreement with Thermo Fisher.
Since then the company has signed several other OEM deals — one with an undisclosed instrumentation vendor that is slated to launch its software by the end of the month, and two others that are in “various stages” of development, Joe Donahue, senior vice president of sales at InforSense, told BioInform. Under one such agreement, announced in February, Shanghai, China-based telecommunications firm Hua Wei Technology will use IVM to build its internal business intelligence platform.
“The broader IVM and OEM strategy is a key one for us, especially as we expand across horizontal markets,” Donahue said. He added that the Thermo Fisher agreement serves as proof that “the technology platform, and IVM in particular, has got the robustness that we can give it to leading companies, and they’re willing to invest in it and use it as the platform for their own products going forward.”
Jonathan Sheldon, chief scientific officer of InforSense, described IVM as “a cut-down version” of the company’s software that allows partners to embed the InforSense analytical workflow engine into third-party applications “in a lightweight way.”
Donahue added that IVM is “cut down, not from a functionality perspective, but from a footprint standpoint, which makes it easy to embed in other pieces of software, in instruments, in mobile devices.”
InforSense sees a key opportunity for IVM in the translational medicine market — a sector that it has already identified as a promising area for its software. In April, the company announced a three-year collaboration with Dana-Farber Cancer Institute to develop a translational research informatics infrastructure [BioInform 04-18-08], and it followed that in May with an agreement with LabVantage Solutions to integrate its software with the company’s Sapphire LIMS with the intention of targeting biobanking management [BioInform 05-02-08].
This week, Sheldon said that IVM might offer particular advantages to partners in translational medicine and diagnostics. “What we’re finding is that people can use the platform, say our GenSense product, to look at associations between genotype and phenotype, but then with the InforSense Virtual Machine they have a very powerful way to take that knowledge and embed that, say, into a diagnostic application that might be used at the point of care.”
He said that InforSense believes this capability will be of particular interest to diagnostic equipment providers “because those guys are going to want to have software with their machines.” He said that diagnostic test developers could use IVM to build “an analytical workflow that processes the information coming off that machine, but also then gives you some kind of therapy recommendation” based on publicly available data.
Sheldon added that the four undisclosed OEM customers that InforSense is currently working with are “involved in that process and are in various stages of getting out to the patient,” but declined to provide further details.