With Reliable Repository Options, MCP Reinstates Raw Mass Spec Data Requirements


NEW YORK (GenomeWeb) – Leading proteomics journal Molecular & Cellular Proteomics has revised its guidelines to require that researchers submit raw data for any submissions involving the use of mass spec.

The move, which MCP editors announced in an editorial published last week, reflects the improved state of repositories for raw mass spec data and the field's increased confidence in the stability of these resources.

Storage of raw mass spec proteomics data has long been an issue for the field, and one that has grown more significant with the rise in production of such data due to more powerful instrumentation, higher-throughput experiments, and increased adoption of shotgun mass spec methods.

Submission of raw data allows outside researchers to more thoroughly assess the accuracy of large mass spec experiments. It also allows for re-analysis of past experiments using different or novel informatics approaches, which could enable discoveries not made by the group that initially generated the data. Additionally, because many proteomics studies are focused on a relatively small aspect of the data generated — protein fold changes or phosphorylation, for instance — deposition of raw data allows other researchers to investigate it from different angles.

MCP had previously required that authors submit their raw data. However, this requirement proved unsustainable when the University of Michigan-based Tranche repository, at the time the only resource for hosting large raw mass spec data sets, ran into funding trouble and began cutting back its activities in 2010.

Since then, several other resources have stepped in to fill the gap, most notably the European Bioinformatics Institute's Proteomics Identifications Database, PRIDE, and the University of California, San Diego's MassIVE repository. The addition of these resources gave the journal the confidence to reinstitute its requirement of raw data submission, Robert Chalkley, a University of California, San Francisco researcher and MCP's data management editor, told GenomeWeb.

"We're confident that there are now multiple repositories able to handle this, so we're not going to have a similar situation as before when one repository failed we had to change our guidelines," he said.

The new policy is slated to go into effect on July 1. To date, the journal has received essentially no pushback from authors, Chalkley said, noting that "practically everyone does it anyway."

In an email to GenomeWeb, William Hancock, a researcher at Northeastern University and editor-in-chief of the Journal of Proteome Research, likewise noted that that publication has had success in getting authors to submit their raw data. He said that for the last 18 months the journal has been "strongly" encouraging authors to submit raw data, adding that while the journal was not planning to formally mandate raw data submission, it was "effectively moving to that point with author education and reviewer involvement."

Chalkley noted that the raw data requirement was aimed primarily at shotgun mass spec experiments. MCP has guidelines for submission of targeted data such as that generated by multiple-reaction monitoring assays, but, he said, these are less fleshed out than the guidelines for shotgun data. The same is true of experiments using data-independent acquisition methods like Swath, he added.

"Quite a lot of the DIA is quantitative stuff and is analyzed in similar sorts of ways to MRM, so it's a similar situation," he said. "We have updated the instructions to say these are the sorts of things you should be supplying, but it's not formally part of the guidelines."

He added that MCP has plans to introduce new data guidelines for targeted experiments.

Chalkley also suggested that some non-specialist journals might begin requiring submission of raw data.

He noted that such journals typically don't have the proteomics expertise on staff to insist that authors of proteomics studies comply with a range of data quality requirements before their papers reach reviewers, but that with regard to "submitting raw data they may follow suit, because it is very easy to do and they are supportive of these concepts as well."

In addition to the requirement for raw data submission, MCP also reiterated its requirement that researchers provide annotated spectra for "all MS/MS protein identifications based on only a single unique peptide," as well as for all post-translational modification assignments, to allow reviewers and others to easily assess the quality of these identifications.

Chalkley said that this is, in fact, a more challenging requirement to comply with than raw data submission, largely because many researchers are not familiar with how to generate these annotated spectra for review within the search engines they are using.

As he and his fellow editors noted in their editorial, this led to many authors "meeting this compliance requirement through the production of one large pdf file of static screenshots of spectra," which "made finding relevant spectra and more than cursory examination of the assignment difficult, and reanalysis of data impossible."

The journal also put in place a new requirement asking authors of quantitative studies to clearly state the key aspects of their experimental design in a single, separate paragraph.

"As people are doing more clinical studies, it's become more important that they think about this sort of thing," Chalkley said. The requirement, he noted, is not so much to ensure that the authors followed a valid experimental design as to ensure that the information necessary to evaluate their design is easily available. Actual determination of the validity of the design will still be left to the journal's reviewers.

"The primary thing we are doing is trying to get a clear paragraph into the manuscript that explains exactly what they did so the reviewer can easily understand that, for instance, there were two biological repeats and two technical repeats," Chalkley said. "We have had a lot of problems in the past where it has been sort of unclear what [study authors] have done."