Tool Benchmarks Performance of Popular Mass Spec Software Packages for Data-Independent Acquisition


NEW YORK (GenomeWeb) – Researchers at the University of Mainz in Germany have developed a tool for benchmarking the performance of different data-independent acquisition mass spec software packages.

Called LFQbench, the tool measures the accuracy and precision of label-free quantitative mass spec experiments. In a paper published last week in Nature Biotechnology, the researchers used the tool to compare and optimize the performance of five commonly used DIA mass spec software programs: OpenSWATH, SWATH 2.0, Skyline, Spectronaut, and DIA-Umpire.

Looking at data generated on Sciex TripleTOF 5600 and TripleTOF 6600 instruments, they found that adjustments suggested by LFQbench provided improved performance compared to the standard software settings, Pedro Navarro, formerly a University of Mainz post-doc and the first author of the paper, told GenomeWeb. In particular, he noted, the tool helped improve the quantitative measurements of low-intensity proteins.

The study also provided insight into the path DIA software development might take in the future. Its results indicate, Navarro said, that the DIA-Umpire software, which differs significantly in its approach from the other packages, could especially benefit from improvements in instrument performance.

Since Sciex launched the first commercial Swath-style DIA method in 2011, such methods have become increasingly popular in proteomics work, with scientists attracted by their relative simplicity and ability to reproducibly quantify thousands of proteins across large numbers of samples.

DIA differs from conventional data-dependent acquisition mass spec. Instead of selecting a subset of available ions for fragmentation and generation of MS/MS spectra, DIA assays select broad m/z windows and fragment all precursors in that window, allowing the mass spec to collect MS/MS spectra on all ions in a sample.
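
The difference in precursor selection can be illustrated with a minimal Python sketch. The precursor list, the top-N cutoff, the m/z range, and the 25 m/z window width below are all hypothetical values chosen for illustration; they are not instrument settings from the study.

```python
# Each precursor is a (m/z, intensity) pair; values are made up for illustration.
precursors = [(412.3, 9e5), (418.9, 2e4), (455.1, 5e5), (471.6, 1e3), (498.0, 7e4)]


def dda_select(precursors, top_n):
    """DDA-style selection: only the top_n most intense precursors are fragmented."""
    ranked = sorted(precursors, key=lambda p: p[1], reverse=True)
    return sorted(mz for mz, _ in ranked[:top_n])


def dia_windows(mz_start, mz_end, width):
    """Tile the m/z range with fixed-width, non-overlapping isolation windows."""
    windows = []
    lo = mz_start
    while lo < mz_end:
        windows.append((lo, min(lo + width, mz_end)))
        lo += width
    return windows


def dia_select(precursors, windows):
    """DIA-style selection: every precursor inside a window is fragmented with it."""
    return {
        w: sorted(mz for mz, _ in precursors if w[0] <= mz < w[1])
        for w in windows
    }


if __name__ == "__main__":
    print("DDA fragments:", dda_select(precursors, top_n=2))
    for window, members in dia_select(precursors, dia_windows(400.0, 500.0, 25.0)).items():
        print("DIA window", window, "co-fragments:", members)
```

In this toy example the DDA-style selection fragments only two of the five precursors, while the DIA-style windows cover all five but co-fragment several precursors per window, which is the source of the complex spectra discussed in the article.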

Because DIA methods fragment all precursors, the same peptides are measured in each experiment. This allows for greater reproducibility than in DDA experiments, where the ions selected for fragmentation vary from experiment to experiment. However, fragmenting all the precursors creates highly complex spectra that must then be deconvoluted on the back end, and, as a result, DIA is typically less sensitive than DDA.

As Navarro and his co-authors wrote, "computational methods … critically affect the results of quantitative proteomics analysis." And, he said, though DIA is still a relatively young technology, numerous software packages exist for analyzing this data.

"We wanted to do an evaluation of the most important [Swath DIA] software tools so that we could actually know a little better how these software tools are working," he said.

He and his colleagues also wanted to develop a package that other researchers could use to benchmark and optimize their own software and instruments, he added.

To develop and test the LFQbench tool, the Mainz researchers collaborated with the developers of many of the most commonly used Swath analysis software packages. They measured proteome samples consisting of mixes of human, yeast, and Escherichia coli proteins on the TripleTOF 5600 and TripleTOF 6600 and then analyzed the data in two iterations. The first used the most current versions of the packages being tested, with settings optimized according to the developers' instructions. The second used the same software packages, but with adjustments suggested by the LFQbench analysis of data from the first iteration.
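
The general idea behind benchmarking with mixed-species samples can be sketched in Python as follows. This is a simplified illustration, not LFQbench's actual implementation: the expected ratios in `EXPECTED_RATIO`, the protein names, and the intensities are hypothetical, and the choice of median deviation for accuracy and standard deviation for precision is an assumption made for this sketch.

```python
import math
import statistics

# Hypothetical expected sample-A : sample-B ratios per species.
# These are NOT the mixing ratios used in the study.
EXPECTED_RATIO = {"human": 1.0, "yeast": 2.0, "ecoli": 0.25}


def log2_ratios(intensities_a, intensities_b):
    """log2(A/B) for every protein quantified with a positive intensity in both samples."""
    return {
        prot: math.log2(intensities_a[prot] / intensities_b[prot])
        for prot in intensities_a
        if intensities_b.get(prot, 0) > 0 and intensities_a[prot] > 0
    }


def benchmark(ratios, species_of):
    """Per species: accuracy = median deviation from the expected log2 ratio,
    precision = population standard deviation of the measured log2 ratios."""
    by_species = {}
    for prot, ratio in ratios.items():
        by_species.setdefault(species_of[prot], []).append(ratio)
    report = {}
    for species, values in by_species.items():
        expected = math.log2(EXPECTED_RATIO[species])
        report[species] = {
            "accuracy": statistics.median(values) - expected,
            "precision": statistics.pstdev(values),
        }
    return report


if __name__ == "__main__":
    # Made-up intensities for six hypothetical proteins.
    sample_a = {"H1": 100.0, "H2": 110.0, "Y1": 210.0, "Y2": 190.0, "E1": 24.0, "E2": 30.0}
    sample_b = {"H1": 100.0, "H2": 100.0, "Y1": 100.0, "Y2": 100.0, "E1": 100.0, "E2": 100.0}
    species = {"H1": "human", "H2": "human", "Y1": "yeast",
               "Y2": "yeast", "E1": "ecoli", "E2": "ecoli"}
    for sp, stats in benchmark(log2_ratios(sample_a, sample_b), species).items():
        print(sp, stats)
```

Because the true ratio for each species is known in advance, an accuracy value near zero and a small precision value indicate that a software package is recovering the expected quantities.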

In the first iteration, the packages performed similarly, offering similar dynamic range and quantitative precision and accuracy, except for DIA-Umpire, which provided fewer quantitated proteins and lower reproducibility than the other approaches.

Using their LFQbench tool, the researchers identified several issues affecting the performance of the tested software, with poor background compensation among the most prominent. After the developers incorporated these findings into their parameters, all of the packages improved in the second iteration of analysis, Navarro said.

"We took the data and analyzed it as they recommend, with the general parameters as they have been optimized by the software developers," he said. "Then we evaluated the results of this and we provided feedback to the developers and told them by using our benchmarking tool which problems we thought their software had. So all of the developers made some changes to their software following our guidance, and in all of them it improved their results in the second iteration."

He suggested that LFQbench could be a broadly useful tool for researchers looking to benchmark and optimize their quantitative proteomics workflows. The Nature Biotechnology study looked only at data from Sciex TripleTOF instruments, but Navarro said the tool is applicable to data from any platform.

In addition to establishing the usefulness of the LFQbench process, the study also provided a look at the state of DIA analysis and where it might go in the future. Using the various approaches, the researchers were able to quantify around 5,000 proteins with high precision and accuracy, a level of performance that they wrote "is similar to other label-free quantification approaches."

The work also suggests that DIA-Umpire, while it generally underperformed the more conventional Swath analysis approaches, could nonetheless prove the way of the future for DIA analysis, Navarro said.

In conventional DIA analyses, researchers first build a spectral library for their sample using a DDA run. They then do targeted searches of data from subsequent DIA runs against this spectral library.

This approach is necessary because the large m/z fragmentation windows used in DIA methods like Swath lead to highly complex spectra with considerable interference between the multiple precursors contained in each window.

DIA-Umpire, on the other hand, uses m/z values and retention times to detect precursor and fragment ions in the MS1- and MS2-level DIA data and match them to one another. It then uses these groupings to generate pseudo-MS/MS spectra that can be searched using conventional database search engines, as is commonly done in DDA experiments.
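
The grouping step can be pictured with a toy Python sketch. It is not DIA-Umpire's actual algorithm: the matching criteria used here (apex scan position and Pearson correlation of elution profiles), the thresholds `max_apex_shift` and `min_corr`, and all m/z values and intensity profiles are assumptions made for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length intensity profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0


def apex_index(profile):
    """Scan index at which the elution profile reaches its maximum."""
    return max(range(len(profile)), key=profile.__getitem__)


def group_fragments(precursors, fragments, max_apex_shift=1, min_corr=0.9):
    """Assign each fragment to precursors that co-elute with it.

    precursors / fragments: dict of m/z -> intensity profile over the same scans.
    Returns a dict of precursor m/z -> sorted fragment m/z list (a pseudo-spectrum).
    """
    pseudo_spectra = {}
    for p_mz, p_profile in precursors.items():
        p_apex = apex_index(p_profile)
        pseudo_spectra[p_mz] = sorted(
            f_mz
            for f_mz, f_profile in fragments.items()
            if abs(apex_index(f_profile) - p_apex) <= max_apex_shift
            and pearson(p_profile, f_profile) >= min_corr
        )
    return pseudo_spectra


if __name__ == "__main__":
    # Hypothetical MS1 (precursor) and MS2 (fragment) elution profiles over seven scans.
    ms1 = {500.25: [0, 2, 8, 10, 8, 2, 0], 623.40: [10, 8, 2, 0, 0, 0, 0]}
    ms2 = {301.1: [0, 1, 4, 5, 4, 1, 0],
           415.2: [5, 4, 1, 0, 0, 0, 0],
           710.3: [0, 0, 0, 1, 4, 5, 4]}
    print(group_fragments(ms1, ms2))
```

Each precursor's list of co-eluting fragments plays the role of a pseudo-MS/MS spectrum. A real implementation would also need to restrict candidates to the isolation window containing the precursor and to handle noise and overlapping peaks.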

This means researchers can perform untargeted searches of DIA data, allowing them to identify peptides not identified in targeted searches. Additionally, once generated, the pseudo-spectra can be used as a spectral library for targeted searching using traditional DIA informatics programs. It also allows researchers to skip the initial DDA run traditionally required for setting up a spectral library.

"At this moment, DIA-Umpire provides less identifications and quantifications than the other methods, but it provides a completely orthogonal method, and that means it can complete [datasets generated] from other methods," Navarro said. In the future, though, it could become a primary method of DIA analysis in and of itself, he added.

The reason for thinking this, Navarro said, is the large gains in performance DIA-Umpire saw when moving from the TripleTOF 5600 to the more powerful 6600.

"The 6600 has improved detectors and it has a better chromatography system," he said. "And the difference in the results from the 5600 to the 6600 with DIA-Umpire is stunning."

"So that says to us that at this moment, DIA-Umpire is very dependent on the data quality," Navarro added. And as instrument performance and data quality improve, the algorithm's performance should likewise improve significantly.

"The other software tools work better at this moment," he said. "But it will not take more than two or three years until machines will be fast enough to apply DIA-Umpire [effectively], and then DIA-Umpire could be even better than the [traditional] approaches."

Navarro noted that conventional library-based DIA analysis would likely retain an advantage in sensitivity.

"So if, for instance, you really want to monitor a certain pathway and you really want to have all the proteins involved in this pathway, maybe the best way to do that will be to … get a good library on this pathway and analyze it with any software doing this kind of [conventional] DIA approach," he said.

"But if you want to have a discovery tool, and you really don't know what you are looking for, maybe better than restricting yourself to a library will be to use an approach like DIA-Umpire," he added. "Both will still be complementary, and it will depend a lot on what the user requires."