ICAT, AQUA, SILAC, iTRAQ: we all know how scientists love to create acronyms to describe their experimental protocols, and so it is with labeling techniques for quantitative proteomics. There are countless acronyms describing protein and peptide labeling techniques for comparing protein expression between different samples using mass spectrometry. In fact, the number of isotope-tagging methods for quantitative proteomics has grown by leaps and bounds over the past few years since Ruedi Aebersold's group at the University of Washington first published on its ICAT (isotope-coded affinity tagging) reagents in 1999.
But now there's a new approach to quantitative proteomics in the works, and the influx of odd-sounding acronyms is sure to follow. Industry and academic researchers as well as mass spectrometry vendors are now promoting the idea that labels aren't required to compare protein expression between two samples in a single mass spectrometry experiment. Instead, these scientists say, for certain types of comparative protein expression experiments it may be enough to rely on the mass spectrometry data itself, given a high-resolution mass spectrometer and sophisticated instrument control and data analysis software.
This development isn't entirely new; for years researchers in both academia and industry have fiddled with these techniques for in-house use or as research projects to explore new experimental methods. The difference now is that soon these approaches may become much more straightforward to adopt, as mass spec vendors like Waters, Thermo Electron, and Bruker Daltonics begin optimizing their instrumentation and software for label-free quantitative proteomics applications. Most notably, at the annual meeting of the American Society for Mass Spectrometry this past May in Nashville, Waters described a new system for combining peak intensity data and exact mass measurements that promises to be the first commercially available protocol directed specifically at label-free quantitative proteomics.
Which isn't to say labeling technologies have lost their spot in the limelight. For many applications that require multiplexing (that is, comparing many samples to a control), tagging technologies are just starting to hit their groove. This past summer Applied Biosystems released a new reagent technology called iTRAQ that allows four samples to be compared in a single experiment, and Agilix, a Yale University spinoff located in New Haven, Conn., described a new reagent technology that promises to allow multiplexing on the order of 20 to 30 samples. Proteomics researchers say multiplexed quantitative proteomics experiments may prove particularly valuable in evaluating the significance of candidate protein biomarkers.
The bottom line is this: For protein biomarker discovery experiments requiring only a low degree of multiplexing, label-free approaches are likely to be sufficient — assuming the mass spectrometer is relatively high-end. But for validating candidate biomarkers and performing more complex experiments involving multiple samples and precise measurements of relative levels of expression, labeling reagents are still the way to go.
Of course, the ability to identify proteins en masse with mass spectrometry is relatively new, so it's certainly not earth-shattering news that quantifying their absolute or relative levels of expression with a mass spec is still a work in progress. Mass spectrometrists in the physical sciences have for years used isotopically-coded labels to differentiate between species in a single experiment, but it wasn't until Aebersold developed his ICAT reagents that researchers could systematically apply the approach to any proteome analysis problem. The challenge at the time was to develop chemistries for attaching a linker and tag to two sets of biological molecules, whereby one set of reagents contains an isotope of an element found in the other. This allows the researcher to compare the relative amounts of identical species derived from the different samples by searching for ions separated on the mass spectra by a mass difference attributable to the isotope.
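The pairing logic described above can be sketched in a few lines of Python. This is only an illustration, not the software any vendor or lab actually uses: the peak list, the 8.0 Da mass shift, and the function name pair_isotope_peaks are all hypothetical, and real data would also require handling of charge states and isotope envelopes.

```python
from typing import List, Tuple

Peak = Tuple[float, float]  # (m/z, intensity)


def pair_isotope_peaks(
    peaks: List[Peak],
    mass_shift: float,
    charge: int = 1,
    tol: float = 0.02,
) -> List[Tuple[float, float, float]]:
    """Find light/heavy peak pairs separated by the label's mass shift.

    Returns (light_mz, heavy_mz, heavy/light intensity ratio) per pair.
    mass_shift is the assumed mass difference between the two tags (Da).
    """
    expected_gap = mass_shift / charge  # spacing on the m/z axis
    ordered = sorted(peaks)
    pairs = []
    for i, (light_mz, light_int) in enumerate(ordered):
        if light_int <= 0:
            continue  # cannot form a ratio against an empty peak
        for heavy_mz, heavy_int in ordered[i + 1:]:
            gap = heavy_mz - light_mz
            if gap > expected_gap + tol:
                break  # peaks are sorted, so no later peak can match
            if abs(gap - expected_gap) <= tol:
                pairs.append((light_mz, heavy_mz, heavy_int / light_int))
    return pairs


# Hypothetical spectrum: two labeled pairs plus one unpaired peak.
peaks = [
    (500.00, 1000.0),
    (508.00, 2000.0),
    (650.25, 400.0),
    (700.10, 300.0),
    (708.10, 150.0),
]
pairs = pair_isotope_peaks(peaks, mass_shift=8.0)
for light_mz, heavy_mz, ratio in pairs:
    print(f"{light_mz:.2f} / {heavy_mz:.2f}: heavy-to-light ratio {ratio:.2f}")
```

In this made-up example, the first pair would suggest the protein is twice as abundant in the heavy-labeled sample, and the second pair half as abundant. The ratio is relative only; it says nothing about the absolute concentration in either sample.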
In 2001, Applied Biosystems released a commercial version of the ICAT reagents to a warm reception. The accolades were deserved, because for the first time researchers could perform a standardized experiment to determine which proteins in a perturbed sample differed in expression level with respect to proteins obtained from a similar sample in its normal state. The method proved particularly adept at sniffing out putative biomarkers or drug targets in experiments comparing samples taken from disease versus normal tissue or serum, says Tony Hunt, senior marketing director for proteomics consumables at Applied Biosystems.
Over time, however, researchers began to appreciate the technical limitations to the ICAT reagent technology. First and foremost, the method allows for only relative protein expression measurements — that is, the experiment can provide a precise value for how much a particular protein is over- or underexpressed, but only in relation to the level of that same protein's expression in the normal state. Absolute measurements would be preferable, researchers say, as a means of pinpointing the exact concentration of the over- or underexpressed protein in the sample — results that would help researchers build quantitative models of protein interaction pathways and more easily compare experimental results across multiple laboratories.
There have been some attempts to remedy this shortfall, most notably in the lab of Steven Gygi, a former postdoc in Aebersold's lab who now holds an assistant professorship in the department of cell biology at Harvard Medical School. Gygi's group developed a procedure called AQUA (for "absolute quantification") that utilizes synthesized, isotopically labeled peptides spiked into the sample of interest to serve as internal standards. The AQUA method, described in a paper published online in the Proceedings of the National Academy of Sciences in May 2003, has emerged as the most robust method for absolute protein quantification, say Aebersold and Leonard Foster, a postdoc in Matthias Mann's laboratory at the University of Southern Denmark.
The caveat is that AQUA requires the synthetic peptides to mimic precisely in structure the corresponding peptide of interest in the sample — aside from the isotopic label. It would be nice if one synthetic peptide could serve as a quantitative reference standard for all peptides in a sample, but variations in ionization efficiency within the mass spectrometer — and other factors — mean that a synthetic peptide is accurate as a standard only for the peptide it attempts to mimic. And because synthetic peptides are expensive, the AQUA method is only practical for absolute quantification experiments involving small numbers of proteins. "The limitation is to get these peptides," says Aebersold, "and synthesizing each one of these peptides can cost between $200 and $400."
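The arithmetic behind an internal-standard measurement of this kind is simple enough to sketch. The Python below is a hedged illustration rather than the published AQUA protocol: the function names, the intensities, and the 50 fmol spike are hypothetical, and the calculation assumes the labeled standard and the native peptide ionize identically. Only the $200 to $400 per-peptide cost range comes from Aebersold's estimate above.

```python
from typing import Tuple


def absolute_amount_fmol(
    native_intensity: float,
    standard_intensity: float,
    standard_spiked_fmol: float,
) -> float:
    """Estimate the amount of a native peptide from a spiked labeled standard.

    Assumes the synthetic peptide matches the native peptide exactly apart
    from its isotopic label, so the intensity ratio equals the molar ratio.
    """
    if standard_intensity <= 0:
        raise ValueError("standard peak must have a positive intensity")
    return native_intensity / standard_intensity * standard_spiked_fmol


def synthesis_cost_range(n_peptides: int) -> Tuple[int, int]:
    """Low and high cost in dollars at $200 to $400 per synthetic peptide."""
    return 200 * n_peptides, 400 * n_peptides


# Hypothetical measurement: native peak is twice the 50 fmol standard's peak.
amount = absolute_amount_fmol(3.0e6, 1.5e6, 50.0)
low_cost, high_cost = synthesis_cost_range(25)
print(f"Estimated native peptide: {amount:.1f} fmol")
print(f"Standards for 25 peptides: ${low_cost} to ${high_cost}")
```

Because each standard is valid only for the one peptide it mimics, the cost scales linearly with the number of peptides measured, which is why the approach suits small panels of proteins better than proteome-wide surveys.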
Another complaint researchers began to voice about the ICAT reagents was that they don't label all the peptides in a sample. By design, the ICAT reagents bind only to cysteine-containing peptides, and in mammalian proteomes about seven percent of the proteins do not contain the amino acid cysteine, says Dick Smith, a proteomics researcher at Pacific Northwest National Laboratory in Richland, Washington (see sidebar). In practice, however, that number is actually about 11 percent, he says, because some proteins fragment into cysteine-containing peptides that are too short to be sufficiently distinctive.
Thus, researchers worried they were missing potentially interesting proteins in their experiments. Of course, there's always a silver lining: by virtue of only labeling cysteine-containing peptides, the ICAT reagents effectively reduced the complexity of a protein sample, making the corresponding analysis by mass spectrometry much easier to deconvolve.
To address concerns about undersampling as well as allow for greater multiplexing, scientists at Applied Biosystems developed the iTRAQ reagents, which hit the market this past spring. This class of tagging chemistries labels amine end groups, found in every peptide, and allows researchers to analyze three different samples plus a control in the same experiment.
Steve Carr, director of proteomics at the Broad Institute and a former senior scientist at Millennium Pharmaceuticals, says the iTRAQ reagents seem particularly suited for multiplexed experiments in which the researcher knows what proteins to look for, and can thereby simplify the analysis of a complex mixture of proteins and improve sensitivity in multiple sample comparisons. "Where iTRAQ in my view has its greatest benefit is in the verification or validation phase rather than the discovery mode, except for relatively simple samples," he says.
"Where it's clearly going to be of use is doing multiplexed, targeted analysis," Carr adds. "For example, if you're monitoring the levels of four candidate proteins, or a single protein across four samples, you can pick peptides derived from those candidate proteins, label several samples, mix them together, and monitor each of those signature peptides. Since you know what the molecular ions are that you're interested in, you can go in and specifically select and fragment those masses. For that kind of experiment it clearly is useful."
Agilix, a five-year-old startup out of Paul Lizardi's lab at Yale University, has taken the multiplexing to another level. At the ASMS meeting last May, company officials claimed the technology could generate labels for multiplexing up to 29 samples at once, and that by further tweaking the linker and tag structure they could generate at least another 30 distinct tags. "We haven't yet found people who can make use of 100 labels, but 100 labels is easily within our scope," Darin Latimer, Agilix's vice president of engineering, told GT's sister publication ProteoMonitor. Agilix officials weren't available for comment in time for press.
In the company's ASMS presentation, Agilix said the labels, called i-PROT, could be designed to bind with various amino acids and could be used to label proteins either pre- or post-tryptic digestion, the process that breaks the proteins into their constituent peptides. The labels, which Agilix developed in collaboration with Brian Chait of Rockefeller University, are made up of polyglycine backbones with cleavable aspartic acid-proline bonds. Company officials at ASMS also said the labeling system is compatible with any tandem mass spectrometer — Agilix has primarily used MALDI instruments — and that they have developed software uniquely designed to analyze the resulting mass spectrometry data.
No Label, No Problem
Above all other issues with the ICAT reagent technology, or any labeling technology for that matter, is the cost of purchasing the various chemistries. With a 10-assay kit of the ICAT reagents running at $1,450, and a multiplex iTRAQ reagent kit at $995 for five four-plex assays, many scientists, those in academia especially, are likely to balk at investing heavily in labeling chemistries. Therein lies the potential advantage of label-free approaches to protein quantification. Over the last few years various groups, including Carr's lab at Millennium Pharmaceuticals, Gygi's at Harvard, and Stan Hefta's at Bristol-Myers Squibb, have developed their own approaches to divining relative protein expression levels solely from liquid chromatography and mass spectrometry data. Now, these efforts are starting to make the jump from one-off system to commercialized product.
LC and mass spectrometry system vendor Waters is gearing up to be the first to mass market a complete system and protocol for sidestepping the labeling technologies. Developed in collaboration with researchers in Hefta's lab at BMS, the system, called the Protein Expression System, relies on a high-end LC system combined with Waters' Q-TOF Premier, the company's most sophisticated quadrupole time-of-flight mass spectrometer. The trick to making the setup efficiently identify and measure the relative amounts of the various proteins found in a sample within the same run, according to Mark McDowall, Waters' marketing director for mass spectrometry, lies in the instrument control protocol, exact mass measurements, and specialized software for deconvolving the mass spectral data that tumbles out the other end.
To ensure that the mass spectrometer does not overlook potentially interesting proteins by undersampling the tryptic peptides eluting off the LC system, the experiment is designed so that the Q-TOF instrument alternates between low and high energies in the ion collision cell. In this way, Waters explains, data collected from the low energy collision corresponds to normal pseudo-molecular ions — analogous to the parent ions in a traditional data-dependent analysis experiment — and the data collected from the high energy collision represent their associated ion fragments. In other words, instead of having to specifically select precursor ions representing peptides the researcher thinks might be of interest, the rapidly oscillating collision scheme allows the mass spectrometer to catch and record fragmentation data from all the peptides entering the mass spectrometer without bias, McDowall says. "You can see everything," he adds. "You can even come back to the data and re-interrogate it, because you know you haven't missed anything."
McDowall says the system's other selling point involves the informatics module designed to deconvolve the overlapping fragmentation data produced by the instrument's rapid duty cycle. By taking into account retention time data collected from the LC end of the experiment and combining that information with algorithms for predicting the identity of the proteins that produced the peptide fragments, McDowall says researchers can obtain both protein identifications and their relative amounts in the same run given the proper use of an internal standard. "In our system all the fragment ions overlap [in the collision cell], but we've got some clever informatics to clear it all up," he says.
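Waters has not published the details of its informatics here, so the following Python is only a toy illustration of the general idea of using retention time to sort out overlapping fragments: a fragment ion is assigned to whichever precursor elutes closest to it, provided the two apexes fall within a tolerance. The function name, the tolerance, and all of the values are hypothetical.

```python
from typing import Dict, List, Tuple


def assign_fragments_by_retention_time(
    precursors: Dict[str, float],
    fragments: List[Tuple[float, float]],
    rt_tol: float = 0.1,
) -> Dict[str, List[float]]:
    """Group fragment ions with the precursor that co-elutes with them.

    precursors maps a precursor name to its elution apex (minutes).
    fragments is a list of (fragment m/z, elution apex in minutes).
    Fragments with no precursor within rt_tol minutes are left unassigned.
    """
    assigned: Dict[str, List[float]] = {name: [] for name in precursors}
    for mz, rt in fragments:
        closest = min(
            precursors,
            key=lambda name: abs(precursors[name] - rt),
            default=None,
        )
        if closest is not None and abs(precursors[closest] - rt) <= rt_tol:
            assigned[closest].append(mz)
    return assigned


# Hypothetical low-energy (precursor) and high-energy (fragment) data.
precursors = {"peptide_A": 12.30, "peptide_B": 12.90}
fragments = [(175.12, 12.31), (288.20, 12.28), (401.29, 12.88), (512.33, 14.00)]
groups = assign_fragments_by_retention_time(precursors, fragments)
for name, mzs in groups.items():
    print(name, mzs)
```

A production system would presumably compare full elution profiles rather than single apex times, and would have to cope with peptides that co-elute almost exactly, which this sketch does not attempt.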
Although Waters hasn't begun shipping the Protein Expression System, McDowall says the company has begun working with proteomics researchers at GlaxoSmithKline to beta test the setup, and has plans to be "shrink-wrapping" and shipping the system to customers by the end of the year.
Waters isn't the only mass spec vendor attempting to standardize a label-free approach to quantitative proteomics. At Thermo Electron, scientists have developed an LC/MS/MS experimental protocol compatible with the company's ion trap mass spectrometers that allows researchers to correlate mass spec peak intensity data with a protein's relative concentration in a sample, says Ken Miller, a mass spectrometry marketing manager at Thermo. Researchers at the company have already demonstrated the experiment's feasibility, Miller says, and are now working with customers to implement the quantitative proteomics strategy.
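The core idea of treating peak intensity as a proxy for relative concentration can be shown with a short calculation. This Python sketch is not Thermo's protocol; the function name and intensities are hypothetical, and it assumes an internal standard spiked at the same level into both runs so that run-to-run differences in signal can be divided out.

```python
import math


def normalized_fold_change(
    sample_intensity: float,
    sample_standard: float,
    control_intensity: float,
    control_standard: float,
) -> float:
    """Relative abundance of a peptide in sample versus control.

    Each peptide intensity is first divided by the intensity of an internal
    standard measured in the same run (an assumption of this sketch).
    """
    if min(sample_standard, control_intensity, control_standard) <= 0:
        raise ValueError("standard and control intensities must be positive")
    sample_norm = sample_intensity / sample_standard
    control_norm = control_intensity / control_standard
    return sample_norm / control_norm


# Hypothetical peak intensities from two separate LC/MS runs.
fold_change = normalized_fold_change(8.0e5, 2.0e5, 4.0e5, 4.0e5)
log2_fold_change = math.log2(fold_change)
print(f"Fold change: {fold_change:.1f} (log2 = {log2_fold_change:.1f})")
```

Note that the raw intensities alone would suggest a twofold difference; the normalization doubles it because the standard's signal was half as strong in the sample run. Corrections of that kind are one reason label-free results tend to agree with labeled results on direction more readily than on magnitude.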
At press time, Thermo had not decided whether to market an instrument and informatics system specifically designed for this type of experiment or whether to incorporate the capability into previously released software. "We're now beyond the feasibility stage," Miller says, "and now we're trying to determine how best to get that application out into the market."
Not to be outdone, other mass spec vendors are also integrating label-free approaches to quantitative proteomics into the software they sell to support their mass spectrometers. SpectrumMill, a mass spec data analysis program developed by Karl Clauser in Carr's lab at Millennium Pharmaceuticals and now sold by Agilent Technologies, contains this functionality, and Bruker Daltonics expects the next version of its Proteineer data analysis suite to include the ability to handle label-free quantitative proteomics experiments.
In many ways, however, there are clear differentiators between labeling and non-labeling techniques for measuring protein expression with mass spectrometry. Label-free approaches that rely on LC and mass spectrometry data currently offer a reasonably acceptable first pass when searching for over- or underexpressed proteins in an experiment, says the Broad Institute's Carr, while new labeling chemistries offer higher precision and multiplexing capabilities well-suited for differentiating between candidate biomarkers to determine which could be the most viable in the clinic.
Ultimately, the challenge will be to figure out which of the small (and large) changes in protein expression are truly biologically meaningful, a task that will require more time-consuming methods, including animal knockout models. "Determining which of the subtle changes are meaningful is hard, and the experiments are much lower throughput," Carr says. "But doing proteomics without biological validation is pointless."
Label vs. No Label: Why Choose?
At Pacific Northwest National Lab, Dick Smith has decided to take an agnostic approach to choosing between labeling and label-free approaches to quantitative proteomics. In fact, some of his experiments attempt to compare the relative amounts of proteins between two or more samples using both tagging chemistries and standalone mass spectrometry data. "The two approaches don't have to be mutually exclusive," he says. "Why not do both simultaneously?"
Of course it helps to have the technical and mass spectrometry resources available to his lab, one of the premier protein mass spectrometry operations in the country. When it comes to tagging technologies, Smith prefers an approach called metabolic labeling, which involves culturing cell lines in the presence of amino acids labeled with 15N isotopes (see sidebar). But with tissue samples, Smith goes with reagents that incorporate the 18O isotope, a set of chemistries advocated by Catherine Fenselau at the University of Maryland that have the advantage of experiencing minimal side reactions — and they're certainly much cheaper than commercially available reagents, Smith adds. The only minor disadvantage, he says, is that for high molecular weight fragments the small mass difference between peptide ions tagged with isotopically and non-isotopically labeled reagents can make overlapping peaks in the spectrometry data difficult to distinguish.
Smith says label-free approaches are attractive because they avoid any potential issues with sometimes chemically overactive reagents, and can often be quite useful in biomarker discovery applications, where it's more important to find relatively large changes in protein expression between samples than to precisely measure the extent to which a protein is over- or underexpressed. In practice, Smith says, "we've been looking at comparing the two [approaches] quite extensively in the last year, and qualitatively the agreement is usually quite good; it's the quantitative agreement that's not always there. That's why I've become a real advocate of doing both."
The Ideal Reagent Technology: Metabolic Labeling
Lower costs and irritating side reactions are beginning to motivate researchers to try label-free approaches, but these same factors also highlight the advantages of an older labeling technology known as metabolic labeling. Although only feasible when studying cell lines developed in culture, metabolic labeling has emerged as a robust and relatively cheap alternative to labeling with the ICAT and iTRAQ reagents, or to labeling with the 18O tags used in the labs of PNNL's Dick Smith (see sidebar) and Catherine Fenselau at the University of Maryland, among others.
The reason is that metabolic labeling involves culturing cell lines in the presence of 15N-tagged essential amino acids — those that the cell itself cannot produce and must incorporate from an external source. One of the more popular approaches to metabolic labeling, called SILAC for stable isotope labeling by amino acids in cell culture, was developed in the lab of Matthias Mann at the University of Southern Denmark. Leonard Foster, a postdoc in Mann's lab who works with the technology, says that as the cells grow, they grab the tagged amino acid to make proteins. Given sufficient amounts of tagged residues, all the proteins in a cultured sample will pick up the coded tag. Then, when the tagged sample is combined and compared with the control, the biochemical fractionation, tryptic digestion, and other work-up steps required to produce the peptide fragments occur after the tags have been incorporated. That reduces the potential for troublesome side reactions and eliminates any potential variations in sample treatment that might occur if the work-up had to occur separately — as is the case with most commercially-available reagent technologies.