Thermo Fisher Scientific last month began shipping its Proteome Discoverer software, a major overhaul of its proteome informatics platform necessitated by the expanding capabilities of mass spectrometers used for proteomics research.
Launched at this year’s American Society for Mass Spectrometry annual conference in June, the system, which includes an in-house developed workflow engine and a slimmed-down version of the InforSense workflow technology called the InforSense Virtual Machine, is the culmination of an original-equipment-manufacturer agreement that the two firms signed in 2006.
At the time of the launch, Thermo Fisher said the software provides the most comprehensive view of quantitative and qualitative proteomic data available [See PM 06/05/08].
For Thermo Fisher, the launch signals a “next-generation” software product that “adapts to the evolving nature of proteomics,” Andreas Hühmer, proteomics marketing director at the company, told ProteoMonitor’s sister publication BioInform.
Hühmer noted that since Thermo launched its widely used Sequest proteomics search engine more than a decade ago, “the field of proteomics has significantly broadened in scope and it covers a much wider array of applications in biology and the clinic.” As a result, he said, “it really has come to the point where … it’s actually impossible for a company to make a turnkey solution for all these evolving fields.”
The company decided several years ago to replace its BioWorks platform, which included Sequest, with “a brand new platform — not just another application, but a platform — that could evolve in the future with the needs of the field,” Hühmer added.
At the heart of the new platform is the concept of the workflow engine, which allows users to build complex analytical pipelines through a graphical user interface. Proteome Discoverer actually includes two such engines: an in-house-developed workflow engine that handles signal processing and protein identification via database searching; and the IVM workflow engine, which can annotate proteins by querying public databases like GenBank, UniProt, and the Gene Ontology.
“We’ve really taken that workflow concept to a level where we now can rapidly adapt to new and emerging workflows in proteomics, and we can do this on the side of the raw data manipulation as well as on the side of data analysis,” Hühmer said.
Thermo Fisher expects the software to help it keep pace with a very rapidly changing field. “Proteomics is not a general tool anymore. Proteomics is specifically used to get answers in a variety of fields,” Hühmer said. “One is, of course, biomarker discovery and validation. Another field is the analysis of very specific target proteins in pharma — biotherapeutics and biosimilars. And then there is the whole rapidly developing field of quantitative biology where people really can look at thousands and thousands of proteins, identify them, and quantitate them at the same time.”
All these application areas present “very new challenges because now you’re looking at a lot of data points — millions and millions of data points — and going through a sensible data reduction is not trivial,” he said.
In addition, Hühmer noted that Thermo Fisher has expanded its line of mass spectrometers in the last several years to the point where it offers a “breadth of technology that is a particular challenge when it comes to data analysis.”
In particular, Thermo offers two types of fragmentation technologies for most of its mass specs: collision-induced dissociation, or CID, and the newer electron transfer dissociation, or ETD, approach. ETD “requires a different way to analyze spectra,” so the company developed a new database search engine for Proteome Discoverer called Z-Core “that specifically takes advantage of the unique ETD capabilities our instruments [have].”
Another advantage of the workflow approach, Hühmer said, is that Thermo Fisher can very rapidly provide new capabilities for its customers in the form of new workflows, as opposed to the more traditional model of releasing a new version of the entire software package, which can take a year or more to develop and implement.
With the new platform, Thermo Fisher can “rapidly deploy new capabilities that do not take an entire software development cycle,” he said. “I can literally send a new workflow via e-mail to a customer. It’s just an XML file, and it can be done in both of the workflow engines.”
In addition, Thermo Fisher’s customers can take advantage of the workflow capability. “If I am a collaborator in a lab and I develop a workflow with certain parameters, and then I want my colleague across the campus or across the country to use the very same standard workflow, I’ll send him an XML file through e-mail and he’ll be able to reproduce exactly the same workflow,” Hühmer said.
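To illustrate why a workflow saved as an XML file can be reproduced exactly by a second lab, the following is a minimal Python sketch of a round trip: a workflow is written out as XML and read back with identical steps and parameters. The element names (`workflow`, `step`, `param`), the node names, and the parameter names are all hypothetical and chosen only for illustration; the article does not describe Proteome Discoverer's actual file format, and this sketch does not attempt to reproduce it.

```python
import xml.etree.ElementTree as ET


def workflow_to_xml(name, steps):
    """Serialize an ordered list of (node, params) pairs to an XML string.

    The schema used here is hypothetical, not a vendor format.
    """
    root = ET.Element("workflow", {"name": name})
    for node, params in steps:
        step = ET.SubElement(root, "step", {"node": node})
        for key, value in params.items():
            ET.SubElement(step, "param", {"name": key, "value": str(value)})
    return ET.tostring(root, encoding="unicode")


def workflow_from_xml(text):
    """Parse the XML produced above back into (name, steps)."""
    root = ET.fromstring(text)
    steps = []
    for step in root.findall("step"):
        params = {p.get("name"): p.get("value") for p in step.findall("param")}
        steps.append((step.get("node"), params))
    return root.get("name"), steps


# Hypothetical two-step workflow; values are kept as strings so the
# round trip can be compared exactly.
original = [
    ("spectrum_selector", {"min_precursor_mass": "350"}),
    ("database_search", {"engine": "Sequest", "precursor_tolerance_ppm": "10"}),
]

xml_text = workflow_to_xml("standard_id", original)   # what one lab would send
name, restored = workflow_from_xml(xml_text)          # what the other lab loads
```

Because every parameter travels inside the file, the receiving lab does not have to re-enter settings by hand, which is the property Hühmer describes.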
He noted that this capability could help address a persistent challenge in the proteomics field: standardization.
“There are a lot of experiments that are done without really having a standardized workflow,” he said, noting that this habit can result in very poor reproducibility across different labs, even if they use the same experimental procedures. With Proteome Discoverer, however, “we actually can drive standardization from the raw file analysis all the way to biological context using that [workflow] concept.
“Given the fact that [the Human Proteome Organization] community has defined a standard now in terms of raw files, the mzML standard, which we are supporting, we support other instruments within the capability of mzML format,” Hühmer said. “So whatever information is passed on through the mzML format from other vendors, we can read and support.”
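As a rough illustration of what reading vendor-neutral mzML data involves, the Python sketch below pulls the MS level of each spectrum out of an mzML-style document using only the standard library. The embedded sample is a heavily trimmed fragment written for this example, not a schema-valid file and not output from any instrument; the namespace and the `MS:1000511` ("ms level") controlled-vocabulary accession follow the published mzML standard, while the function name `ms_levels` is an assumption of the sketch.

```python
import xml.etree.ElementTree as ET

# Namespace used by mzML documents.
NS = {"mz": "http://psi.hupo.org/ms/mzml"}
# Controlled-vocabulary accession for "ms level".
MS_LEVEL = "MS:1000511"

# Trimmed, illustrative fragment (not schema-valid mzML).
SAMPLE = """<mzML xmlns="http://psi.hupo.org/ms/mzml">
  <run id="example_run">
    <spectrumList count="2">
      <spectrum index="0" id="scan=1" defaultArrayLength="0">
        <cvParam cvRef="MS" accession="MS:1000511" name="ms level" value="1"/>
      </spectrum>
      <spectrum index="1" id="scan=2" defaultArrayLength="0">
        <cvParam cvRef="MS" accession="MS:1000511" name="ms level" value="2"/>
      </spectrum>
    </spectrumList>
  </run>
</mzML>"""


def ms_levels(text):
    """Return a dict mapping each spectrum id to its MS level."""
    root = ET.fromstring(text)
    levels = {}
    for spectrum in root.iterfind(".//mz:spectrum", NS):
        for cv in spectrum.iterfind("mz:cvParam", NS):
            if cv.get("accession") == MS_LEVEL:
                levels[spectrum.get("id")] = int(cv.get("value"))
    return levels


levels = ms_levels(SAMPLE)
```

Since the metadata is keyed by shared controlled-vocabulary accessions rather than vendor-specific field names, the same lookup applies regardless of which instrument produced the file.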
He said that customers can also add Mascot and other third-party search algorithms to the software platform — another first for the firm. “We worry about the data reduction a lot, and that’s why we made the commitment and the investment to write the software and deliver it to our customers,” he said. “But if the customer decides that Mascot is the best search engine on the planet, we don’t want to disagree with our customer.”
Future releases of Proteome Discoverer will likely “focus more around adding label-free quantification, metabolic labeling, and then really expanding on the capabilities that we have generated with the IVM workflow,” Hühmer said. “One could imagine, for example, linking directly to pathway information.”
A version of this article originally appeared in the Sept. 12 edition of ProteoMonitor’s sister publication BioInform.