NEW YORK (GenomeWeb) – Researchers from the Broad Institute and the University of Washington have developed software that could improve the performance of data-independent acquisition (DIA) mass spectrometry experiments.
Detailed in a paper published this week in Nature Methods, the software uses linear algebra to analyze the convoluted spectra generated by DIA mass spec and, according to its developers, boosts the sensitivity and accuracy of DIA analyses.
DIA mass spec selects broad m/z windows and fragments all precursors in each window, which allows the instrument to collect MS/MS spectra on all the ions in a sample. This means that, unlike in conventional data-dependent acquisition (DDA) mass spec experiments, the instrument is looking at the same peptides in every sample, making for more reproducible quantitation.
The use of broad m/z windows, however, presents a challenge: it results in very complicated spectra with considerable noise, as the precursors captured in these windows interfere with one another. The complexity of these spectra means DIA analyses typically measure less of the proteome than equivalent DDA experiments, and it has also limited the effectiveness of DIA methods for distinguishing between highly similar peptides, such as molecules featuring single amino acid variations or alternative post-translational modification site localizations.
"DIA is a very promising method in that we hope that it will allow us to be much more comprehensive in detecting peptides in proteomics samples," said Jacob Jaffe, associate director of the Proteomics Platform at the Broad and senior author on the Nature Methods study. "The challenge has been interpretation of the [DIA] spectra, because they're super complicated."
DIA analysis tools have adopted a variety of strategies for deconvoluting the complex spectra generated by the technique. One common approach is to compare different characteristics of experimental MS2 spectra to spectra in a previously generated spectral library. However, Jaffe and his co-authors wrote, these methods are not generally able to "rigorously account for the confounding effects of precursor cofragmentation," which makes it difficult to distinguish between highly similar peptides.
Named Specter, the approach developed by Jaffe and his colleagues tackles the complexity of DIA spectra by using linear algebra to determine what combination of spectra best explains the ions comprising the mixed spectra generated by a DIA run.
The underlying notion is "that for any given MS2 spectrum, all of the [spectral] library precursors that fall into the window for that spectrum can potentially play a role in the analysis," said Ryan Peckner, first author on the study and a post-doc in Jaffe's lab. "The point of the linear algebra is to ask, out of all the potential precursors, what is the single combination [of precursors] that best explains the MS2 spectrum that we've observed?"
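The idea Peckner describes can be sketched in a few lines of Python. This is a rough illustration rather than Specter's actual code: the library entries, m/z values, and intensities are invented, the fragment intensities are assumed to sit on a shared set of m/z bins, and plain least squares stands in for whatever solver the software really uses.

```python
# Illustrative sketch only: all names and numbers are hypothetical, and this
# is ordinary least squares, not Specter's actual implementation.

def solve_linear(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("library spectra are not linearly independent")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def least_squares(columns, observed):
    """Coefficients c minimizing ||sum_j c[j] * columns[j] - observed||^2 (normal equations)."""
    gram = [[sum(p * q for p, q in zip(u, v)) for v in columns] for u in columns]
    rhs = [sum(p * q for p, q in zip(u, observed)) for u in columns]
    return solve_linear(gram, rhs)


def candidates_in_window(library, lo, hi):
    """Keep only library precursors whose m/z falls inside the isolation window."""
    return {name: e for name, e in library.items() if lo <= e["precursor_mz"] <= hi}


# Hypothetical spectral library: fragment intensities on five shared m/z bins.
LIBRARY = {
    "PEPTIDE_A": {"precursor_mz": 501.2, "fragments": [1.0, 0.5, 0.0, 0.2, 0.0]},
    "PEPTIDE_B": {"precursor_mz": 503.8, "fragments": [0.0, 0.5, 1.0, 0.0, 0.3]},
    "PEPTIDE_C": {"precursor_mz": 640.1, "fragments": [0.4, 0.0, 0.0, 1.0, 0.0]},
}

# A mixed MS2 spectrum built as 2 x PEPTIDE_A + 3 x PEPTIDE_B.
observed = [2.0, 2.5, 3.0, 0.4, 0.9]

in_window = candidates_in_window(LIBRARY, 500.0, 510.0)
names = sorted(in_window)
coeffs = least_squares([in_window[n]["fragments"] for n in names], observed)
abundances = dict(zip(names, coeffs))
print(abundances)
```

In this toy case the fit recovers the two mixing weights for the precursors inside the 500 to 510 m/z window, while the out-of-window library entry is never considered.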
"That means you can start with a single MS2 spectrum and ask, what are the precursors in our libraries that constitute that spectrum? Rather than taking the approach that I think has been prevalent up until now, which has been to look for correlations between the chromatograms of pre-selected fragment ions," he added.
One advantage of the linear algebra-based approach is an intrinsically low false discovery rate, Peckner noted.
"There are mathematical principles that say how unique you expect the solutions to systems of linear equations to be," he said. "And in situations like this one, where your systems of equations are over-determined, which, in very concrete terms means that there are more fragments in these MS2 spectra than there are library members that could possibly belong to those spectra, in that situation in linear algebra, you always have a unique optimal solution to the system at hand."
This, in turn, makes the approach effective at distinguishing between highly similar peptides, giving researchers "access to a level of sensitivity that wasn't possible before with DIA," Peckner said.
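The uniqueness argument can be made concrete with a small check. The sketch below rests on the textbook condition that a least-squares fit is unique when the candidate library spectra are linearly independent, which an over-determined system makes possible but which the article does not spell out; the function names and toy spectra are hypothetical.

```python
# Illustrative sketch only: toy spectra and helper names are hypothetical.

def column_rank(columns, tol=1e-9):
    """Rank of the matrix whose columns are the given spectra (Gaussian elimination)."""
    rows = [list(v) for v in columns]  # the rank of a matrix equals that of its transpose
    rank = 0
    width = len(rows[0]) if rows else 0
    for col in range(width):
        if rank == len(rows):
            break
        pivot = max(range(rank, len(rows)), key=lambda r: abs(rows[r][col]))
        if abs(rows[pivot][col]) <= tol:
            continue  # no usable pivot in this fragment bin
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(rank + 1, len(rows)):
            factor = rows[r][col] / rows[rank][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return rank


def is_overdetermined(columns):
    """True when there are more fragment bins (equations) than library spectra (unknowns)."""
    return len(columns[0]) > len(columns)


def has_unique_fit(columns):
    """True when the library spectra are linearly independent (full column rank)."""
    return column_rank(columns) == len(columns)


# Two distinct library spectra measured on five fragment bins.
distinct = [[1.0, 0.5, 0.0, 0.2, 0.0], [0.0, 0.5, 1.0, 0.0, 0.3]]
# Second spectrum is just a rescaled copy of the first, so the two are indistinguishable.
degenerate = [[1.0, 0.5, 0.0, 0.2, 0.0], [2.0, 1.0, 0.0, 0.4, 0.0]]

print(is_overdetermined(distinct), has_unique_fit(distinct))
print(is_overdetermined(degenerate), has_unique_fit(degenerate))
```

Both toy systems have more fragment bins than library spectra, but only the first has a single best-fitting combination; two peptides can be told apart only insofar as their library spectra differ in at least some fragments.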
In the Nature Methods study, the researchers used Specter to distinguish between three groups of synthetic peptides whose members differ by either a single amino acid or the transposition of two adjacent amino acids. Spiking random sets of these peptides into E. coli lysates that they then analyzed using DIA and Specter, they found they were able to distinguish between the peptides and correctly identify all but one in all experiments.
They also used the approach to analyze DIA phosphoproteomic data previously generated in an 84-run experiment looking at the effects of 28 kinase inhibitors on PC3 prostate cancer cells, finding that they were able to identify a substantial number of the 176 sets of phosphopeptide positional isomers known to be present in that dataset.
"Among the 176 sets of positional isomers in the spectral library, at least one member of each set was identified in 31 of the 84 runs on average, and both members were identified in 17 of the runs on average," the authors wrote.
They also demonstrated the potential biological relevance of such isomers by using Specter to investigate positional phospho-isomers of the protein plectin, which is a substrate of CDK1 and MNK2, two kinases involved in cancer signaling pathways. By looking at changes in the ratio of plectin phospho-isomers in response to different drug treatments, the researchers were able to establish that one isomer is likely due to phosphorylation by CDK1 while the other is likely due to phosphorylation by MNK2.
Using the DIA software benchmarking tool LFQbench, the researchers compared the performance of Specter to that of other popular DIA software packages. Analyzing a publicly available dataset consisting of DIA runs looking at a mix of human, yeast, and E. coli proteomes, they found that Specter identified 4,733 proteins while the other tools identified between 4,518 and 4,692.
One potential trade-off is the high level of computing power required to run the method, Peckner said, noting that "ideally you're running it on a computing cluster."
He suggested this might be why similar approaches have not come to fruition, despite being proposed in contexts like metabolomics and gas chromatography.
"I think the concept, once it's been stated, is kind of obvious," he said. "The question is actually making it practicable."
To that end, the Specter software is designed to run in an Amazon cloud computing environment, making it widely accessible, Jaffe said.
"I do think that we're entering a point where, especially labs that are doing DIA are quite computationally advanced just because of the complexity of the software that's out there," Peckner said. "It's quite likely that they'll either have that infrastructure or will know how to set up some sort of cloud computing that will make it work."