NEW YORK (GenomeWeb) – A team led by researchers from the École Polytechnique Fédérale de Lausanne (EPFL) and the mass spectrometry firm Spectroswiss has put forth a chemical digestion approach that could prove effective for middle-down proteomics experiments.
The method uses 2-nitro-5-thiocyanatobenzoic acid (NTCB) to digest proteins into peptides ranging from around 4 kDa to 10 kDa and averaging around 6 kDa, said Kristina Srzentić, one of the developers of the workflow. Formerly a graduate student at EPFL, Srzentić is currently a postdoc in the lab of Northwestern University researcher Neil Kelleher, who has also investigated middle-down proteomic approaches.
Middle-down proteomics aims to stake out ground between conventional bottom-up approaches and the less widespread but growing practice of top-down, or intact protein, proteomic analysis. The argument for the approach is that it would achieve some of the top-down community's goals of improving the study of protein isoforms and post-translational modifications while avoiding the technical challenges inherent in measuring intact proteins on a proteome-wide scale. Larger peptides would also reduce the complexity of the peptide digests being studied, which could improve analyses.
Thus far, however, the proteomics community has struggled to identify a digestion process capable of reproducibly producing peptides in the 4 kDa to 10 kDa range that is ideal for middle-down experiments. Trypsin digestion, which is commonly used in bottom-up workflows, typically produces peptides of around 1 kDa.
Researchers have pursued a variety of approaches. For instance, last year Jennifer Brodbelt, professor of chemistry at the University of Texas, published a method that combined carbamylation of proteins prior to trypsin digestion, to increase the average size of the digested peptides, with ultraviolet photodissociation (UVPD) mass spec. Using the method in Escherichia coli samples, Brodbelt and her team generated peptides with an average size of 2.2 kDa, with 18 percent of the generated peptides larger than 3 kDa and a small proportion at 6 kDa and above.
Prior to that, Utrecht University's Albert Heck published on efforts using both the proteases Asp-N and Glu-C, as well as a non-enzymatic method in which his lab incubated proteins at high temperatures in diluted formic acid. The protease-based method generated peptides with an average size of 1.5 kDa, while the formic acid method generated peptides averaging 1.9 kDa.
Researchers including Kelleher have also looked into whether the outer membrane protease T, OmpT, might work for middle-down digestion. OmpT recognizes not one but two amino acids and will cut only at sites where both are present, which typically leads to production of longer peptides compared to trypsin. However, the enzyme is not entirely specific for di-basic sites, and not all proteins contain di-basic sites, meaning experiments using it won't be able to achieve full proteome coverage.
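As a rough illustration of why a two-residue recognition site yields longer fragments, the di-basic cleavage rule can be sketched in silico. This is a simplified model, not the Kelleher group's actual workflow: real OmpT is not entirely specific for di-basic sites, and the toy sequences here are hypothetical.

```python
import re

def ompt_digest(seq):
    """Simplified in silico OmpT-style digest: cut between two
    consecutive basic residues (Lys/Arg). Real OmpT also cleaves
    at some non-di-basic sites, which this sketch ignores."""
    # Zero-width lookahead finds every position followed by two
    # basic residues; the cut falls between them.
    cuts = [m.start() + 1 for m in re.finditer(r'(?=[KR][KR])', seq)]
    peptides, prev = [], 0
    for cut in cuts:
        peptides.append(seq[prev:cut])
        prev = cut
    peptides.append(seq[prev:])
    return peptides

# Toy sequence with two di-basic sites -> three fragments.
print(ompt_digest("MAGTKRLLDEKRAAST"))
# A protein with no di-basic site is never cut, so it contributes
# no peptides in the middle-down mass range.
print(ompt_digest("MAGTSLLDEAAST"))
```

Because a di-basic site is much rarer than a single Lys or Arg, the expected fragment length is correspondingly longer than for trypsin.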
In a May study in the Journal of Proteome Research, Srzentić and her colleagues explored several chemical digestion approaches, including cyanogen bromide (CNBr), BNPS-Skatole (3-bromo-3-methyl-2-(2-nitrophenyl)sulfanylindole), o-iodosobenzoic acid, and NTCB. The researchers used these chemical methods to digest a seven-protein mixture consisting of yeast enolase 1 and 2, bovine apo-transferrin, serum albumin, pancreatic ribonuclease A, chicken egg white lysozyme, and bovine carbonic anhydrase 2.
The researchers started with a small set of proteins so they could monitor for expected and unexpected chemical shifts stemming from the digestion reactions, Srzentić said.
"Even if you took the smallest proteome you could find, like 1,000 or 2,000 proteins, it would be almost impossible to [manually] check for all the chemical shifts that were introduced and that we reported in that paper," she said.
Additionally, the seven-protein mix included proteins lacking the residues targeted by the chemical reactions used, which Srzentić said allowed the researchers to see what these chemicals would do in the absence of their target residue.
"We wanted to see if there were any alternative reactions that took place in place of the one we were looking for," she said.
The experiment indicated that NTCB was the most promising of the chemical approaches. In NTCB digestion, proteins are cleaved at the amino-terminal side of cysteine residues, which, according to an in silico analysis of the human proteome, would produce around 95,000 peptides ranging between 3 kDa and 15 kDa and allow for identification of around 19,000 proteins. A comparable bottom-up analysis using trypsin digestion would generate more than 550,000 peptides and identify nearly 20,000 proteins.
While the trypsin-based experiments have the potential to identify somewhat more proteins, the significantly smaller number of peptides generated in the NTCB-based experiments could make for less challenging LC separations and mass spec analyses, potentially allowing researchers to go deeper into the proteome.
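The in silico comparison above rests on simple cleavage rules: NTCB cuts on the N-terminal side of cysteine, while trypsin cuts after lysine or arginine except when proline follows. A minimal sketch of such a digest, using a hypothetical toy sequence (not a real protein) and approximate average residue masses, shows the same qualitative effect on a small scale:

```python
import re

# Approximate average amino acid residue masses in Da; a peptide's
# mass adds ~18 Da for its terminal water.
AVG_MASS = {'G': 57.05, 'A': 71.08, 'S': 87.08, 'P': 97.12, 'V': 99.13,
            'T': 101.10, 'C': 103.14, 'L': 113.16, 'I': 113.16,
            'N': 114.10, 'D': 115.09, 'Q': 128.13, 'K': 128.17,
            'E': 129.12, 'M': 131.19, 'H': 137.14, 'F': 147.18,
            'R': 156.19, 'Y': 163.18, 'W': 186.21}

NTCB = r'(?=C)'              # cut N-terminal to every Cys
TRYPSIN = r'(?<=[KR])(?!P)'  # cut after Lys/Arg unless Pro follows

def digest(seq, rule):
    """Split seq at every zero-width match of the cleavage regex."""
    cuts = [m.start() for m in re.finditer(rule, seq)
            if 0 < m.start() < len(seq)]
    peptides, prev = [], 0
    for cut in cuts:
        peptides.append(seq[prev:cut])
        prev = cut
    peptides.append(seq[prev:])
    return peptides

def avg_mass_da(pep):
    return sum(AVG_MASS[aa] for aa in pep) + 18.02

# Hypothetical toy sequence: Cys is rarer than Lys/Arg, so the NTCB
# digest yields fewer, larger peptides than the tryptic digest.
seq = "MKTAYIAKQRCDEFGHLKMNPQRSTVWYACGHIKLMNPQR"
for name, rule in (("NTCB", NTCB), ("trypsin", TRYPSIN)):
    peps = digest(seq, rule)
    print(name, len(peps), [round(avg_mass_da(p)) for p in peps])
```

Run over a whole proteome's sequences, the same counting yields comparisons like the roughly 95,000 NTCB peptides versus more than 550,000 tryptic peptides reported for the human proteome.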
"NTCB shows promise," said Yury Tsybin, senior author on the JPR study. Tsybin was formerly an assistant professor at EPFL and is currently the CEO and founder of Spectroswiss. "It produces peptides in the desired mass range and it doesn't produce many secondary cleavages or side products. It's quite clean, quite specific, and reproducible, so that is what we liked about it."
Tsybin added that in terms of cost and the time and work required, NTCB digestion was comparable to standard trypsin digestion.
However, he said, the lack of informatics tools tailored to NTCB digestion workflows is currently a bottleneck preventing further development of the technique.
"You need software where you can say, ok, find a mass shift that I don't know about," Srzentić said. She noted that some top-down software packages like MS-Align, developed by Pavel Pevzner and colleagues at the University of California, San Diego, are designed to work with unassigned mass shifts and might therefore be well suited to analysis of NTCB-generated peptides. But, she added, such packages would need tweaks to adapt them specifically for that workflow.
"I think the software will come along," Srzentić said. "It's not like this is a bottleneck that cannot be overcome. It's just something that hasn't been developed because there hasn't been much demand."
Until it is developed, proteome-wide experiments using NTCB digestion will be impractical, Tsybin said. "We're not able to go into larger proteomes because there are not the bioinformatics to support that. If we have a mixture of a couple proteins we can do things manually, but if you have a mixture of thousands of proteins there is no way you can do that manually."
Given that limitation, it is difficult to say how effective the approach will ultimately prove to be.
"In principle, from what we have seen, we can say it has potential," Tsybin said. "But it's not clear how it will perform in a large-scale experiment."
For instance, he noted that the researchers don't yet know what will happen in a sample with widely varying protein concentrations, as is the case for most biological samples.
"This is a chemical, not an enzyme," he said. "And when you mix a chemical with a protein, you need to have the proper ratio of chemical to the protein. So, say you have one order, two orders, three orders, 10 orders of magnitude variation in [protein] concentration across your sample. What will happen? That we still don't know."
Srzentić noted that, in the absence of effective software, the NTCB approach might nonetheless prove useful for more targeted applications like analysis of therapeutic antibodies, where the ability to generate large peptides could bridge the bottom-up and top-down mass spec techniques currently used by many drug companies to characterize their biopharmaceuticals.