NEW YORK (GenomeWeb) – AB Sciex and Illumina this week announced the launch of their OneOmics project, an exclusive partnership between the two companies to integrate proteomics and next-generation sequencing analysis.
Under the partnership, AB Sciex will place its Swath Proteomics Cloud Tool Kit, a suite of informatics tools for use with the company's Swath mass spec technology, in Illumina's BaseSpace cloud computing environment.
The goal of the effort is to provide users with more streamlined methods for integrating genomic and proteomics data while also fostering technology development in this area, said Aaron Hudson, senior director of the academic and omics business at AB Sciex.
"It's not like there are standard ways to start integrating this data," he told ProteoMonitor, noting that, while BaseSpace has added through the OneOmics project a small selection of tools for integrating proteomics and genomics data, "there is a lot of innovation that can go on top of that to get the most out of the omics data."
Though still in its early stages, integration of proteomic and genomic data has become an area of significant research interest of late.
For instance, the National Cancer Institute's Clinical Proteomic Tumor Analysis Consortium aims to combine protein biomarker discovery and verification studies in tumor tissue samples with genomic characterizations of those same samples done by the NCI's Cancer Genome Atlas project.
In May, a team led by Johns Hopkins University researchers published in Nature a mass spec-based draft map of the human proteome that, through integration of various levels of genomic data, identified 808 novel annotations of the human genome, though some of these findings have been called into question by outside researchers.
More generally, proteomics researchers have begun exploring the potential of RNA-seq for creating sample-specific proteomic search databases, which allow the identification of protein variants not present in generic search databases.
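The idea behind a sample-specific search database can be illustrated with a toy Python sketch. This is not the method used by any tool named in this article. The gene names, sequences, and the `build_sample_database` helper are hypothetical, and real pipelines also handle variant calling, reading frames, and splice isoforms.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds):
    """Translate a coding sequence (DNA alphabet) up to the first stop codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def build_sample_database(reference, sample_transcripts):
    """Add proteins seen in the sample's transcripts but absent from the reference."""
    database = dict(reference)
    known = set(reference.values())
    for name, cds in sample_transcripts.items():
        protein = translate(cds)
        if protein and protein not in known:
            database[name] = protein
            known.add(protein)
    return database


# Hypothetical generic database and RNA-seq-derived transcripts (assumed data).
reference = {"GENE1": "MAKE"}
sample_transcripts = {
    "GENE1_wt": "ATGGCCAAGGAATAA",   # matches the reference protein
    "GENE1_var": "ATGGCCAAGGATTAA",  # single-base change, E -> D
}
database = build_sample_database(reference, sample_transcripts)
```

In this sketch the variant protein appears only in the sample-specific database, so a search against the generic database alone could not match spectra from its variant peptide.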
"Proteomics both in research and clinical settings is huge in being able to paint a picture about the way a pathway works or the way a drug is interacting with a cell or the way diseases progress," said Jordan Stockton, Illumina's director of product marketing, computational biology.
"I think [adding proteomics capabilities to BaseSpace] is consistent with Illumina's strategy around integrating as many different data types as possible," he added. "We look at genomics as only one piece of a huge puzzle in understanding biology and human health."
The OneOmics project revolves – on the mass spec side – around AB Sciex's Swath technology, a data-independent acquisition mass spec approach. In DIA approaches like Swath, the mass spec selects broad m/z windows and fragments all precursors in that window, allowing it to collect MS/MS spectra on all ions in a sample. While DIA approaches typically identify fewer proteins than conventional data-dependent acquisition mass spec, they are more reproducible.
This reproducibility, Hudson said, makes DIA proteomic data well suited to integration with genomic data.
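The contrast between the two acquisition modes can be sketched in a few lines of Python. This is a simplified illustration, not AB Sciex's implementation. The 25 m/z window width, the m/z range, the precursor list, and the function names are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Window:
    low: float   # lower m/z bound (inclusive)
    high: float  # upper m/z bound (exclusive)


def make_windows(start, stop, width):
    """Tile the range [start, stop) with fixed-width isolation windows."""
    windows = []
    low = start
    while low < stop:
        windows.append(Window(low, min(low + width, stop)))
        low += width
    return windows


def dia_select(precursors, windows):
    """DIA: every precursor inside a window is fragmented, whatever its intensity."""
    return {w: [p for p in precursors if w.low <= p[0] < w.high] for w in windows}


def dda_select(precursors, top_n):
    """DDA: only the top_n most intense precursors are picked for fragmentation."""
    return sorted(precursors, key=lambda p: p[1], reverse=True)[:top_n]


# Hypothetical precursor ions as (m/z, intensity) pairs.
precursors = [(402.3, 9e5), (415.8, 2e3), (431.1, 4e5), (447.6, 1e3), (460.2, 7e5)]

windows = make_windows(400.0, 475.0, 25.0)
dia = dia_select(precursors, windows)
dda = dda_select(precursors, top_n=3)
fragmented_by_dia = sum(len(ions) for ions in dia.values())
```

Because the DIA selection here does not depend on intensity ranking, the same ions are fragmented in every run, which is the property behind the reproducibility claim. The DDA selection changes whenever the relative intensities shift between runs.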
"You can run 100 samples and quantify and identify a few thousand proteins reproducibly, so that starts to bring [proteomics] into the kind of capabilities Illumina has had for a while with RNA-seq and the next-gen sequencing," he said.
"If you're going to start comparing data from different disciplines they need to be comparable in different contexts, so [using Swath] made a lot of sense," Stockton agreed.
The project launches with several proteomics-focused applications available on BaseSpace, including an app developed by Yale University researcher Christopher Colangelo for integrated RNA-seq and Swath data, which enables generation of sample-specific proteomic search databases from RNA-seq data. Also available is the Institute for Systems Biology's SwathAtlas, a tool for planning Swath experiments and depositing and searching Swath datasets.
The Swath Cloud Toolkit AB Sciex has added to BaseSpace consists of four apps for proteomics work: Protein Expression Extractor, for processing raw mass spec data; Protein Expression Assembler, for protein fold-change analysis; Protein Expression Browser, to visualize results in a biological context; and Protein Expression Analytics, for data quality review.
Some capabilities of these apps, such as processing raw Swath data and identifying and quantifying the proteins within it, are also available in AB Sciex's standard PeakView software, Hudson said. Other tools, particularly some of the data visualization tools, are available only through the BaseSpace apps, he noted.
"This is not just rehashing the same software and putting it into the cloud," Hudson said. "It's about trying to solve biological problems. If you look at the apps, you don't actually see a mass spectrum anymore – you see up and down regulation of the proteins and some biological insights as well, gene ontology, for instance."
This approach fits with the larger notion behind BaseSpace, Stockton said.
"Our goal with BaseSpace is really to allow people to ask biological questions of the data and to make the data processing effectively go away," he said. "We thought a cloud-based infrastructure to allow people [to run experiments] in a fairly push-button way was enabling for this new class of biologists who really don't want to get involved in what is fundamentally a data processing step."
Cloud computing also has the advantage of significantly shortening analysis time, Hudson said, noting that an analysis of 100 samples that took company researchers three days on a high-end desktop took one hour in BaseSpace.
Stockton said the hope was that the cloud environment would also help foster additional development of tools for integrating various omics analyses.
"One of the ways to think about this is as a foundation," he said. "By pulling in and processing Swath data and putting it in a format that is relatively standard, we give the academic community a platform to try new things and popularize new techniques for integrating [proteomics and genomics data]."
Stockton said the OneOmics project was exclusive to AB Sciex for now, noting that Illumina is "really interested in what the Swath technology can do because of its unique reproducibility."
Other vendors, including Waters and Thermo Fisher Scientific, offer similar DIA mass spec techniques that feature improved reproducibility compared to DDA workflows. At the American Society for Mass Spectrometry annual meeting in June, Thermo Fisher introduced four different types of DIA methodologies for use on its Q Exactive, Q Exactive HF, and Orbitrap Fusion instruments.
In fact, the OneOmics project could be seen as a response to Thermo Fisher's acquisition early this year of Life Technologies and its next-gen sequencing business, although Stockton and Hudson both said that deal had little to do with the AB Sciex-Illumina initiative.
By bringing proteomics and next-gen sequencing technologies under the same roof, Thermo Fisher would seem to be in a promising position for developing and selling integrated workflows combining these types of omics data. The company has not yet made public any such efforts, however.