PHILADELPHIA — New methods of protein and peptide fractionation, top-down versus bottom-up proteomics, and new statistical methods for analyzing proteomic data were among the topics discussed here this week at Cambridge Healthtech Institute's Biomarker Discovery Summit.
Multiple lectin affinity columns, IPG strips, and MicroSol-Isoelectric Focusing were some of the less conventional approaches that were presented as ways of fractionating proteins and peptides before mass spec analysis.
William Hancock, the Bradstreet Chair in Bioanalytical Chemistry at the Barnett Institute of Chemical and Biological Analysis at Northeastern University (see Proteomics Pioneer), explained that multiple lectin affinity columns can be used to deplete plasma samples of immunoglobulins, albumin, and proteins that are not glycosylated.
"The glycosylated fraction is enriched very nicely," he said. "The non-bound fraction has a problem because it's still got albumin. So now you do an albumin depletion, which is relatively simple, and you're in good shape to do some quite deep plasma proteomics."
Jim Stephenson, the senior program director of mass spectrometry research at the Research Triangle Institute, presented immobilized pH gradient strips, or IPG strips, as a first-dimension separation technique that provides high-resolution isoelectric focusing.
Stephenson noted that the technique is not only good for separation, but also good for checking protein identification results later.
"What this gives you is sort of an experimental way to set cut-offs," said Stephenson. "You can go in and say, 'I know I want to draw the line right here because I have an orthogonal technique.' If the peptides I find via my MS-MS search don't fit within this defined pI range, it's a false positive."
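Stephenson's pI cut-off check can be sketched in a few lines of code. The sketch below is not from the talk; the peptide names, pI values, and function name are hypothetical, and it assumes each peptide's predicted pI has already been computed by some external tool.

```python
# Sketch of using an IPG fraction's pI range as an orthogonal check on
# MS/MS identifications: peptides whose predicted pI falls outside the
# fraction's window are flagged as likely false positives.
# (Illustrative only -- the peptides and pI values are made up.)

def flag_out_of_range(identifications, pi_min, pi_max):
    """Return the (peptide, predicted_pi) pairs whose predicted pI
    lies outside the fraction's [pi_min, pi_max] window."""
    return [(pep, pi) for pep, pi in identifications
            if not (pi_min <= pi <= pi_max)]

# Hypothetical MS/MS hits from a fraction focused between pI 4.0 and 5.0
hits = [("PEPTIDEA", 4.3), ("PEPTIDEB", 8.1), ("PEPTIDEC", 4.9)]
suspect = flag_out_of_range(hits, 4.0, 5.0)
# PEPTIDEB's predicted pI (8.1) falls outside the window, so it is flagged
```

Any hit flagged this way would then be re-examined rather than reported as an identification.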
David Speicher, the director of the proteomics laboratory at the Wistar Institute, presented a four-dimensional separation strategy that allows for the detection of plasma proteins that range in concentration over nine orders of magnitude: The highest-concentration proteins are present in the milligram-per-milliliter range, while the lowest-concentration proteins are present in the picogram-per-milliliter range (see ProteoMonitor 4/8/05).
After first depleting the six most abundant proteins, Speicher's group uses a technology called MicroSol-Isoelectric Focusing, or MicroSol-IEF, as the second dimension of its 4D technique. The technology separates proteins according to their isoelectric points. Following this separation, researchers run the fractions out on 1D electrophoresis gels. The gels are sliced up, digested with trypsin, and the resulting fractions are then run out on a non-capillary reversed-phase high-performance liquid chromatography column.
Speicher noted that out of the 18 laboratories that participated in the Human Proteome Organization's pilot Human Plasma Proteome Project study, his laboratory — using the 4D technique — identified the most proteins overall.
Top-Down Versus Bottom-Up Proteomics
Maria Warren, the assistant director of the Michael Hooker Proteomics Core Facility at the University of North Carolina, noted that bottom-up proteomics is better in terms of sensitivity, while top-down proteomics gives better information on whether a protein has been modified, as well as on the protein's stoichiometry.
To show that the two methods can be complementary, Warren presented a case study where bottom-up proteomic methods were used to identify two phosphorylation sites on an RNA-binding protein in Drosophila called dSLBP. Top-down proteomic techniques were then used to deduce that two forms of the protein exist, and that the C-terminal end of the protein was phosphorylated.
"No matter what the technique you use, whether you're using top-down or bottom-up, you need to be sure that what you think you're identifying is what you really are identifying," noted Stephenson. "In top-down, the identification problem really doesn't exist to a large extent because you're in a scenario where you're looking at intact proteins. With bottom-up, you're looking at a lot of small peptides that can be common to a lot of different proteins."
Statistics in Proteomics
James Lyons-Weiler, an assistant professor in pathology and bioinformatics at the University of Pittsburgh Cancer Institute, noted that powerful statistics can help reduce the number of patient samples that are needed for a study.
Lyons-Weiler pointed to a paper published by Mikel Aiken in the Journal of Alternative and Complementary Medicine in 2002 that showed that it was possible to reduce the number of patients needed to produce meaningful results in a study by an order of magnitude using a technique called multivariate balancing.
In addition, Lyons-Weiler described a new statistical approach called the "k of m" approach, where k is the number of times that a technique needs to be replicated and m is the number of disease markers.
To use the "k of m" approach, a certain percentage of samples must be set aside as a "learning set," and the rest of the samples make up the validation set.
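The article does not detail the "k of m" procedure beyond this split, so the following is only a generic sketch of setting aside a learning set and a validation set; the split fraction, patient labels, and random seed are all assumptions for illustration.

```python
import random

def split_samples(sample_ids, learning_fraction=0.3, seed=0):
    """Randomly set aside a fraction of the samples as the learning set;
    the remaining samples form the validation set."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = sample_ids[:]
    rng.shuffle(shuffled)
    n_learn = int(len(shuffled) * learning_fraction)
    return shuffled[:n_learn], shuffled[n_learn:]

# Hypothetical patient IDs
patients = [f"patient_{i:02d}" for i in range(20)]
learning, validation = split_samples(patients, learning_fraction=0.3)
# With 20 patients and a 0.3 fraction, 6 go to learning and 14 to validation
```

Candidate markers would be selected on the learning set and then tested on the untouched validation set, which is what makes the estimate of performance honest.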
"It looks like the statistical power of this approach is outstanding compared to the t-test," said Lyons-Weiler. "How many patients do you need to do a study? It depends on your method."
With better statistical methods, biological resources can be conserved, and researchers can better combine their discoveries to make effective panels of biomarkers, Lyons-Weiler noted.
"If we do not fund and systemize research informatics, much of the entire biomarker discovery and molecular profiling enterprise will have been wasted," he said.
— Tien-Shun Lee ([email protected])