NEW YORK (GenomeWeb) – A team led by researchers from the University of Michigan has developed a new informatics package for analysis of data-independent acquisition (DIA) mass spec data.
The method, named DIA-Umpire, lets users generate pseudo-tandem MS spectra from DIA data, making possible conventional database searching and the generation of spectral libraries without the need for a separate data-dependent acquisition (DDA) run.
Detailed in a paper published today in Nature Methods, the software offers a versatile and streamlined approach to DIA data analysis, Alexey Nesvizhskii, a University of Michigan researcher and senior author on the paper, told GenomeWeb.
Enabled by advances in instrumentation and software, DIA mass spec has emerged in recent years as a popular complement and alternative to conventional DDA workflows.
In DDA mass spec, the instrument performs an initial scan of precursor ions entering the instrument and selects a sampling of those ions for fragmentation and generation of MS/MS spectra. Because instruments can't scan quickly enough to acquire all the precursors entering at a given moment, many ions — particularly low-abundance ions — are never selected for MS/MS fragmentation and so are not detected.
In DIA, on the other hand, the mass spec selects broad m/z windows and fragments all precursors in that window, allowing the machine to collect MS/MS spectra on all ions in a sample.
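The difference between the two acquisition modes can be sketched in a few lines of Python. This is a toy illustration only, not instrument control code: the precursor list, the `top_n` setting, and the fixed window width are all hypothetical values chosen for the example.

```python
from typing import List, Tuple

Precursor = Tuple[float, float]  # (m/z, intensity); values below are made up


def dda_select(precursors: List[Precursor], top_n: int) -> List[Precursor]:
    """DDA-style selection: keep only the top_n most intense precursors."""
    return sorted(precursors, key=lambda p: p[1], reverse=True)[:top_n]


def dia_windows(mz_start: float, mz_end: float, width: float) -> List[Tuple[float, float]]:
    """Split an m/z range into consecutive, non-overlapping isolation windows."""
    windows = []
    lo = mz_start
    while lo < mz_end:
        windows.append((lo, min(lo + width, mz_end)))
        lo += width
    return windows


def dia_select(precursors: List[Precursor], window: Tuple[float, float]) -> List[Precursor]:
    """DIA-style selection: every precursor inside the window is fragmented."""
    lo, hi = window
    return [p for p in precursors if lo <= p[0] < hi]


precursors = [(410.2, 9e5), (455.7, 1e3), (512.3, 5e5), (630.9, 2e3)]

# DDA with top_n=2 never fragments the two low-abundance ions.
dda_picked = dda_select(precursors, top_n=2)

# DIA cycles through every window, so each ion is fragmented,
# but ions sharing a window end up in the same mixed MS/MS spectrum.
dia_picked = [dia_select(precursors, w) for w in dia_windows(400.0, 700.0, 100.0)]
```

In this toy case the DDA selection drops the ions at m/z 455.7 and 630.9, while the DIA cycle covers all four, at the cost of co-fragmenting the two ions that fall in the 400 to 500 window.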
Use of broad m/z windows, however, presents a challenge for DIA analysis in that it results in very complicated spectra with considerable noise, as the precursors captured in these windows interfere with one another. To get around this, DIA analyses have typically employed a targeted approach to identifying and quantifying peptides akin to the process used in multiple-reaction monitoring mass spec.
This approach requires that researchers first generate a spectral library for their sample using a conventional DDA run. They can then search data from subsequent DIA runs in a targeted manner against this spectral library.
DIA-Umpire, on the other hand, uses m/z and retention times to detect precursor and fragment ions in DIA MS1- and MS2-level data and to match them to one another. It then uses these groupings to generate pseudo-MS/MS spectra that can be searched using conventional database search engines, as is commonly done in DDA experiments.
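As a rough illustration of the general idea of grouping fragments with a precursor by elution behavior, consider the Python sketch below. It is not DIA-Umpire's algorithm, which the article does not describe in detail. It assumes, purely for the example, that ions are matched when their intensity profiles across the same scans peak at nearly the same scan and are highly correlated; the function names, the `min_corr` and `max_apex_shift` thresholds, and the data are all hypothetical.

```python
from math import sqrt
from typing import Dict, List, Tuple


def pearson(a: List[float], b: List[float]) -> float:
    """Pearson correlation of two equal-length intensity profiles."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a flat profile carries no shape information
    return cov / sqrt(var_a * var_b)


def apex_index(profile: List[float]) -> int:
    """Index of the scan at which the ion's signal peaks."""
    return max(range(len(profile)), key=profile.__getitem__)


def build_pseudo_spectrum(
    precursor_profile: List[float],
    fragments: Dict[float, List[float]],  # fragment m/z -> intensity per scan
    min_corr: float = 0.9,       # assumed threshold, for illustration only
    max_apex_shift: int = 1,     # assumed tolerance in scans
) -> List[Tuple[float, float]]:
    """Collect fragments that co-elute with the precursor as (m/z, apex intensity)."""
    apex = apex_index(precursor_profile)
    peaks = []
    for mz, profile in fragments.items():
        if abs(apex_index(profile) - apex) > max_apex_shift:
            continue  # peaks at a different retention time
        if pearson(precursor_profile, profile) >= min_corr:
            peaks.append((mz, max(profile)))
    return sorted(peaks)


precursor = [0, 2, 8, 10, 7, 1]
fragments = {
    175.1: [0, 1, 4, 5, 3.5, 0.5],  # same shape as the precursor: kept
    300.2: [5, 4, 3, 2, 1, 0],      # elutes earlier: rejected
    420.3: [3, 3, 3, 4, 3, 3],      # mostly flat background: rejected
}
pseudo_spectrum = build_pseudo_spectrum(precursor, fragments)
```

The resulting list of fragment peaks, paired with the precursor m/z, is the kind of pseudo-MS/MS spectrum that could then be written out and handed to a conventional database search engine.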
The software also allows users to generate spectral libraries from these pseudo-MS/MS spectra, enabling targeted DIA-style searching as well.
Given the "multiplex fragmentation and very complex spectra" involved in DIA, "everyone knew that having untargeted analysis of such datasets using conventional database searching wasn't a good idea," Nesvizhskii said of the initial forays into DIA data analysis.
However, now that targeted DIA analyses are well established, "I think this untargeted analysis is the natural second step, to see what with improved signal processing and improved algorithms we can actually get out of those datasets without relying on spectral libraries," he said.
"I think the key for us was to demonstrate that without relying on targeted analysis of DIA data we can get results, and to look at how one can combine analyses in this kind of hybrid workflow," he added.
The first portion of the workflow, in which precursor and fragment ions are matched using m/z and elution times, is similar to Waters' MSE DIA approach, which likewise uses elution times to match precursor ions to fragment ions, allowing them to be searched against conventional databases as in DDA mass spec.
In the Waters approach, however, the instrument fragments the entire mass range instead of cycling through wide m/z windows, making for even more convoluted spectra. Additionally, the Waters approach does not let researchers generate spectral libraries for targeted DIA searches, although researchers at Johannes Gutenberg University Mainz are currently working on a version of the Waters approach that would allow for spectral library-based searching.
As JGU researcher and leader of this effort Stefan Tenzer told GenomeWeb in an interview last year, using spectral libraries as part of the Waters approach would allow the researchers to do quantitation on fragment ions instead of precursor ions, which could improve the method's dynamic range.
"I think the spectral library-based approaches have an advantage with lower-[abundance] signals because they are looking at multiple fragments at one time and not just one ion," he said. "So we could be able to identify more [proteins] and more reliably in the lower orders of magnitude."
Speaking to GenomeWeb last week, Swiss Federal Institute of Technology Zurich researcher Ruedi Aebersold noted that in targeted workflows like MRM, data at the MS2 fragment ion level is typically less noisy than the MS1 precursor level data, making it possible to detect and quantify the fragment ions of precursors that were not detectable in MS1.
Because DIA-Umpire looks at both MS1- and MS2-level data, the approach could allow researchers to analyze the extent to which this is true of Swath-style DIA experiments as well, said Aebersold, who was not involved in development of the tool.
It is also possible, he said, that the new approach could find data at the MS1 level not apparent at the MS2 level. Traditional Swath-style analysis — which was largely developed in Aebersold's lab — relies solely on MS2 data for peptide detection and quantification. It could also be the case, Aebersold noted, that applying conventional DDA-style database searching to DIA datasets could identify peptides not detected via spectral library searching.
Generally speaking, though, Aebersold said he would be somewhat surprised if the method identified significantly more or different peptides than existing DIA approaches.
"Right now for the kind of targeted [DIA analysis] strategy that we introduced, there are basically three software tools that are out there – OpenSwath [from Aebersold's lab], Spectronaut from [proteomics firm] Biognosys, and PeakView from AB Sciex," he said. "The encouraging thing is that they all find roughly the same number and type of peptides. And so what I think that means is there is probably not a whole lot of [additional] stuff that can be found in this data."
"But if [DIA-Umpire] did [discover more], it would be interesting to see why," Aebersold added.
Nesvizhskii said that in initial work with the software, the untargeted and targeted searches appeared to provide complementary information, suggesting that applying both to the same sample will result in more peptides and proteins identified.