NEW YORK (GenomeWeb) – Researchers at the Max Planck Institute of Biochemistry have devised a method for determining protein copy number in mass spec experiments based on the levels of histone proteins in a sample.
Detailed in a paper published last month in Molecular and Cellular Proteomics, the approach relies on the fact that histones are present in cells in a fixed proportion to DNA, which is present in a fixed amount per cell. Based on these relationships, the researchers were able to estimate per-cell protein copy numbers using only the mass spec signal of histones in a sample.
The approach matches the precision of more laborious methods of estimating protein copy numbers, such as spiking in isotope-labeled reference standards, and also provides information on cell size and protein concentration, Jacek Wiśniewski, a Max Planck researcher and first author on the study, told ProteoMonitor.
The knowledge underlying the method is not new, Wiśniewski said, calling it the result of a "kind of scientific evolution in thinking."
For instance, he noted, it has been known since the 1970s that DNA and histones are present in cells in a fixed ratio. Additionally, in 2012, Wiśniewski and his colleagues presented what they termed their Total Protein Approach to determining per-cell protein copy numbers, which uses the mass spec intensities of the measured proteins combined with cell size estimations based on measurements of total DNA content.
This, he said, led to the thought that, instead of measuring DNA content to estimate cell size, they could simply use the intensities of the signals from histone peptides.
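As a rough illustration of that idea, the Python sketch below scales each protein's mass spec signal by the total histone signal and converts the result to a copy number. It is not the authors' implementation. The function name, the default DNA mass per cell, the histone-to-DNA mass ratio, and all input values are assumptions made for the example, as is the premise that summed intensity is proportional to protein mass.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def histone_ruler_copy_numbers(
    intensities,
    molar_masses,
    histone_ids,
    dna_mass_per_cell_pg=6.5,        # ASSUMPTION: roughly a diploid human cell
    histone_to_dna_mass_ratio=1.0,   # ASSUMPTION: placeholder for the fixed ratio
):
    """Estimate per-cell copy numbers from summed MS intensities.

    intensities:  protein id -> summed MS intensity
    molar_masses: protein id -> molar mass in g/mol
    histone_ids:  ids of the proteins counted as histones
    """
    histone_signal = sum(intensities[p] for p in histone_ids)
    if histone_signal <= 0:
        raise ValueError("no histone signal; the scaling is undefined")

    # Histone mass per cell follows from the fixed DNA content per cell.
    histone_mass_pg = dna_mass_per_cell_pg * histone_to_dna_mass_ratio

    copies = {}
    for protein, intensity in intensities.items():
        # Protein mass per cell scales with its signal relative to histones.
        mass_g = intensity / histone_signal * histone_mass_pg * 1e-12
        # Convert grams to molecules via the molar mass.
        copies[protein] = mass_g / molar_masses[protein] * AVOGADRO
    return copies


# Made-up numbers, for illustration only.
example_intensities = {"H4": 2.0e9, "H2B": 1.5e9, "ENZ1": 7.0e8}
example_masses = {"H4": 11_400.0, "H2B": 13_900.0, "ENZ1": 50_000.0}
estimates = histone_ruler_copy_numbers(
    example_intensities, example_masses, {"H4", "H2B"}
)
for name, n in estimates.items():
    print(f"{name}: {n:.3g} copies per cell")
```

Because the histone signal comes from the same run as every other protein, the scaling in this sketch needs no cell count and no separate protein assay.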
"So finally, [in the MCP paper] we combined this knowledge [that DNA could be used as a constant] with the knowledge that the ratio of histones to DNA in a cell is [fixed]," he said.
Knowing per-cell protein copy number can be desirable for systems biology experiments and clinical measurements. However, Wiśniewski noted, conventional methods of determining these copy numbers face a number of challenges.
For instance, approaches using spiked-in labeled peptide standards require accurate measurements of total protein concentration, which, Wiśniewski said, can vary widely depending on the assay used.
"You see different assays giving you totally different amounts of protein," he said.
Additionally, establishing protein copy numbers requires knowing the total number of cells present, which can be determined by cell counting or calculated using measurements of the total protein per cell and cell volume. However, as the MCP authors note, "cells are not necessarily uniform" and "a 25 percent variation of the diameter of a sphere-shaped cell corresponds to [a] two-fold change in cell volume."
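The arithmetic behind that figure is simple to check: the volume of a sphere scales with the cube of its diameter, so a diameter 1.25 times larger gives a volume about 1.95 times larger. A minimal Python sketch, with a function name chosen only for this example:

```python
def sphere_volume_ratio(diameter_ratio):
    """Return the volume ratio of two spheres given their diameter ratio.

    Volume is proportional to diameter cubed, so the constant factors cancel.
    """
    return diameter_ratio ** 3


# A 25 percent larger diameter: close to a two-fold change in volume.
print(sphere_volume_ratio(1.25))
```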
Counting cells visually is also challenging, they said, noting that, for example, "up to five-fold differences in calculated cell volumes have been reported for enterocytes of the intestinal mucosa."
These potential sources of variation make determining per-cell protein copy numbers a difficult endeavor. Indeed, Wiśniewski said, variation of two-fold or greater is typical for the process.
The group's histone-based approach does not particularly improve on the precision of these approaches, he noted. But, he said, it provides equivalent or slightly better performance with a much simpler workflow.
Indeed, because the technique relies on the signal of histones measured during the larger mass spec analysis, it doesn't require additional steps or reagents. "You can estimate the [copy number] of the proteins in your dataset for free," Wiśniewski said.
One concern with the method, he noted, was that because histones are heavily modified, some of these modified peptides could be difficult to identify, which could affect the ratio of the histone mass spec signal to the overall protein mass spec signal – the key to the technique.
To investigate this potential problem, the researchers searched their data with the peptide search engine set to look for different combinations of modifications. This allowed them to see how the histone mass spec signal changed depending on which modifications were identified. Ultimately, they found that while individual histones changed in their relative abundance levels, the ratio of total histone signal to overall protein signal changed only by 5 percent to 10 percent.
The accuracy of the approach depends on the size of the dataset, Wiśniewski said, noting that the more peptides identified, the more accurate the measures of the histone and total protein signals.
In part, this is because histones contribute some of the most intense peptides, meaning that in smaller datasets the histone signal will be overestimated. However, this is not a problem in datasets of around 14,000 peptides or larger, Wiśniewski said.
He also noted that the more peptides in a dataset for a protein, the more accurate the copy number estimation will be.
"The more abundant the proteins, the more accurate the values," he said. "When a protein is giving you only a few peptides, the number will be skewed."
"But," he added, "I think that for upwards of 70 percent of proteins in big data sets you can [confidently] say this is the copy number."