NEW YORK (GenomeWeb) – Researchers at the University of Oxford's Target Discovery Institute (TDI) have developed a new mass spec workflow, which they used to perform one of the deepest proteomic characterizations to date of a single cell line.
Described in a study published last week in the Journal of Proteome Research, the workflow, named CHOPIN (for CHarge Ordered Parallel Ion aNalysis), combines protein digestion by multiple enzymes with extensive fractionation and a parallel mass spec analysis enabled by the unique architecture of Thermo Fisher Scientific's Orbitrap Fusion Lumos instrument.
Due largely to the multiple-enzyme digestion, the approach is not practical for quantitative work or high-throughput analyses, said Roman Fisher, senior scientist at TDI and senior author on the study. However, he noted, it allows for extremely deep proteomic coverage, making it potentially useful for discovery work that can then be followed up using more targeted, higher-throughput methods.
In the JPR paper, Fisher and his co-authors, who included researchers from Thermo Fisher, analyzed the MCF-7 breast cancer cell line, identifying 13,728 proteins while achieving 57 percent median protein sequence coverage.
Both figures push the outer bounds of proteomic coverage achieved in a single experiment, Fisher said, noting that the high sequence coverage was key to the high number of proteins identified.
Typically, he said, proteomic experiments "struggle to get over, like, 40 percent median sequence coverage." And this, he noted, limits the different protein isoforms they are able to detect.
With their median 57 percent sequence coverage, Fisher and his colleagues were able to "distinguish between protein isoforms we usually are not able to."
"We identified close to 14,000 proteins in our data, and a quarter of those proteins were isoforms that you probably wouldn't be able to distinguish if you didn't have this high sequence coverage," he said.
Key to achieving this sequence coverage was the use of the enzyme elastase, in addition to trypsin, for protein digestion. Trypsin, which is the most common enzyme used for proteomic experiments, cuts at specific amino acids, allowing researchers to search only for peptides digested at these points when making their identifications.
Elastase has a different, and broader, specificity than trypsin, meaning that a combined trypsin-elastase digest will create a collection of peptides different from that generated by trypsin alone. As the TDI researchers found, this increases proteome and sequence coverage. However, it also significantly increases the complexity of the sample digest and the size of the search space when making identifications.
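The effect of broader cleavage specificity on the search space can be sketched with a toy in silico digestion. The sketch below is illustrative only: the sequence is an arbitrary example, the elastase cleavage set (after Ala, Val, Ser, Gly, Leu, Ile) is a common textbook approximation rather than the parameters used in the study, and the length and missed-cleavage limits are hypothetical search settings.

```python
def digest(sequence, cleave_after, missed_cleavages=2, min_len=6, max_len=30):
    """Enumerate candidate peptides for a given cleavage specificity."""
    # Cleavage sites fall after any residue in cleave_after.
    sites = {0, len(sequence)}
    sites.update(i + 1 for i, aa in enumerate(sequence) if aa in cleave_after)
    sites = sorted(sites)
    peptides = set()
    # Peptides may span up to missed_cleavages uncut sites.
    for i in range(len(sites) - 1):
        for j in range(i + 1, min(i + 2 + missed_cleavages, len(sites))):
            pep = sequence[sites[i]:sites[j]]
            if min_len <= len(pep) <= max_len:
                peptides.add(pep)
    return peptides

# Arbitrary toy sequence, not from the study.
seq = "MKWVTFISLLFLFSSAYSRGVFRRDAHKSEVAHRFKDLGEENFKALVLIAFAQYLQQ"

tryptic = digest(seq, cleave_after="KR")       # trypsin: cuts after Lys/Arg
elastase = digest(seq, cleave_after="AVSGLI")  # elastase: broader specificity
combined = tryptic | elastase

print(len(tryptic), len(combined))
```

Even on a single short sequence, the combined candidate list is larger than the tryptic one alone; across a whole-proteome database the same effect multiplies into the much longer search times Fisher describes.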
In fact, data analysis proved the limiting step for the workflow, Fisher said. Sample prep took around two days, followed by LC-MS analysis, for which they ran 45 fractions using one-hour gradients. However, he said, "the search of the elastase data takes quite a long time. If you use, for instance, three medium-end workstations, it's going to take like three weeks just data crunching."
For this reason, use of elastase has largely been limited to digests of samples containing small numbers of proteins for which researchers are aiming to get high levels of sequence coverage, Fisher said.
"Nobody really has tried to use it on something like a full cell digest, because as you can imagine the complexity of that sample as compared to [a standard trypsin digest] is insane," he said.
Use of elastase also makes it difficult, if not impossible, to do reliable quantitative work, due to the inability to achieve reproducible digestion, Fisher said. For good quantitation across samples "the digest has to have happened in exactly the same way, and I'm not 100 percent sure that can be guaranteed [with elastase]," he said.
And while quantitation is key to, for instance, protein biomarker studies and other common proteomic analyses, Fisher suggested that there are benefits to deeper coverage even in the absence of good quantitative data.
"I think it becomes really interesting if you really want to pull apart isoforms, different proteoforms," he said. "In our data we found about 200,000 different modifications on the whole proteome and more than 200 different [post-translational modification] types, as well, which is something you would never really see in [a typical] deep proteome data set. So, if you want to really have a detailed look at the sample, not quantitation, but mapping, I think then elastase is going to be the best choice at the moment."
In addition to the dual-enzyme digestion strategy, the workflow made use of Thermo Fisher's Orbitrap Fusion Lumos and its Orbitrap and ion trap analyzers to increase proteome coverage.
Essentially, the method diverts different precursor ions to either the Orbitrap or the ion trap depending on their characteristics, allowing analyses of these different populations to proceed in parallel, making fuller use of the instrument's analysis time and increasing its overall sampling capacity.
"We're using a sort of intelligent data-dependent acquisition to select the scan mode that is most appropriate for a given precursor ion," Fisher said.
Applying the approach, high-abundance precursors with high charge states undergo higher-energy collisional dissociation (HCD) fragmentation and are then analyzed using the Orbitrap. Meanwhile, lower-charge, lower-abundance ions are fragmented using collision-induced dissociation (CID) and sent to the higher sensitivity ion trap.
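The routing logic described above amounts to a small decision tree over each precursor's charge and abundance. The following sketch is a hypothetical illustration of that idea, not the published CHOPIN method file: the threshold values and the `Precursor`/`route` names are invented for the example.

```python
from dataclasses import dataclass

# Hypothetical cutoffs for illustration; not the study's actual parameters.
INTENSITY_THRESHOLD = 5e4  # precursor abundance cutoff
CHARGE_THRESHOLD = 3       # charge state cutoff


@dataclass
class Precursor:
    mz: float
    charge: int
    intensity: float


def route(p: Precursor) -> tuple:
    """Pick a fragmentation mode and mass analyzer for a precursor ion.

    High-abundance, high-charge precursors go to HCD with Orbitrap
    readout; lower-charge, lower-abundance ions go to CID with ion
    trap readout, so the two analyzers can work in parallel.
    """
    if p.charge >= CHARGE_THRESHOLD and p.intensity >= INTENSITY_THRESHOLD:
        return ("HCD", "Orbitrap")
    return ("CID", "ion trap")


print(route(Precursor(mz=650.3, charge=4, intensity=2e5)))  # ('HCD', 'Orbitrap')
print(route(Precursor(mz=480.7, charge=2, intensity=1e4)))  # ('CID', 'ion trap')
```

Because each branch sends its fragments to a different analyzer, scans on the two populations can be interleaved rather than queued behind one detector, which is the source of the sampling gain Fisher describes.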
Typically, the instrument's Orbitrap is used for a quick MS1 scan, with fragment ions diverted to the ion trap for MS2 analysis, Fisher said. "However, while these [MS2 scans] happen, the Orbitrap isn't doing anything, so, it's basically a waste of time. So, we add the additional [Orbitrap] scan.
"It's a very straightforward data decision tree method that everyone can just edit and add to their method portfolio," he said, adding that the method files are included as part of the JPR manuscript to allow outside groups to easily add them to their workflows.