Researchers at Australia's University of Newcastle have published one of the first analyses of data from the Alzheimer's Disease Neuroimaging Initiative's plasma proteome project.
Detailed in a paper in the current edition of PLoS One, the analysis identified a number of potentially diagnostic protein signatures, including a ten-analyte panel that distinguished between controls and mild cognitive impairment patients who progressed to Alzheimer's with sensitivity and specificity of roughly 90 percent.
Among the most interesting results of the analysis, which looked at data generated on 190 plasma markers in 566 individuals with MCI, Alzheimer's, or normal cognition, was the increased accuracy provided by longitudinal data collected through the study, said Pablo Moscato, a Newcastle researcher and author on the paper.
Using Myriad RBM's DiscoveryMap immunoassay platform, the ADNI study collected measurements on the 566 subjects at the beginning of the project and then one year later. This longitudinal data, Moscato told ProteoMonitor, proved particularly useful in his team's analysis.
"The big news for us from this study is the longitudinal part," he said. "We did several [types of analyses] – analyzing analytes by themselves, the analytes at one point in time, the longitudinal analysis, and I think that what really paid off the most was to use the longitudinal information."
Such information, Moscato noted, can help account for the biological and experimental variability that is inherent in biomarker research and that has proven a significant challenge for the field.
"One of the problems [with biomarker studies] is that when somebody has the first symptoms, they don't necessarily [present] at the same moment in the progression of the disease," he said. "Also, you have to consider that the disease is taking place on top of the normal aging process of the brain, where there is naturally loss of function.
"So people will [enter a study] at different ages and at different stages of the disease, and that is why there's an intrinsic problem with building a test on [samples collected at] just one moment in time," he added. "Probably the future [of biomarker research] is to look at longitudinal changes in panels of proteins."
The authors noted in the paper that data from an even wider range of time points would be desirable given that a single year might be "of insufficient duration to detect substantial clinical or proteomic differences."
In an interview with ProteoMonitor following the initial release of the ADNI dataset in 2010, Holly Soares, director of clinical neuroscience biomarkers at Bristol-Myers Squibb and chair of ADNI's Biomarker Consortium project team, said the organization might consider a second study measuring proteins at additional time points if results from this study suggested such an approach could be promising (PM 12/10/2010).
In their analysis of the ADNI study, the Newcastle researchers built an eight-analyte signature using cross-sectional data that distinguished between controls and mild cognitive impairment patients who progressed to Alzheimer's with sensitivity and specificity of roughly 85 percent. Using the longitudinal data, they were able to build a panel that raised sensitivity and specificity to above 90 percent.
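For readers less familiar with the two figures quoted above, the following minimal Python sketch shows how sensitivity and specificity are computed from a classifier's predictions. The labels are invented for illustration and are not ADNI data; the function name `sensitivity_specificity` is likewise hypothetical.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels.

    Assumed convention for this sketch: 1 = MCI patient who progressed
    to Alzheimer's, 0 = control.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative sample")
    sensitivity = tp / (tp + fn)  # fraction of true progressors detected
    specificity = tn / (tn + fp)  # fraction of controls correctly cleared
    return sensitivity, specificity


# Invented example: 10 progressors followed by 10 controls.
y_true = [1] * 10 + [0] * 10
y_pred = [1] * 9 + [0] + [0] * 8 + [1, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

In this toy case one progressor is missed and two controls are flagged, so the two numbers differ even though overall accuracy is a single figure.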
Another key aspect of their analysis, Moscato said, was a meta-feature-based technique in which the researchers built signatures based on pairs of analytes and their relative abundance. They applied this technique to both their cross-sectional analysis, which used eleven proteins comprising eight pairings, and their longitudinal analysis, which relied upon nine proteins comprising five pairings. In both cases, they found that it provided significantly higher accuracy than signatures based on unpaired analytes, which achieved sensitivities and specificities of between 65 percent and 86 percent depending on the panel components.
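The article does not give the exact formula the Newcastle team used for its pair-based meta-features. As a rough sketch of the general idea, the Python below turns a set of single-analyte measurements into one feature per pair, assuming the meta-feature is simply the difference between the two measurements. The analyte names, values, and the function `pair_meta_features` are all hypothetical.

```python
from itertools import combinations


def pair_meta_features(sample):
    """Map {analyte: value} to {(a, b): value_a - value_b} for each pair.

    Assumption for this sketch: the relative abundance of a pair is
    expressed as a plain difference (e.g. of log-scale measurements).
    """
    names = sorted(sample)
    return {(a, b): sample[a] - sample[b] for a, b in combinations(names, 2)}


# Hypothetical measurements for one subject.
sample = {"analyte_A": 2.0, "analyte_B": 0.5, "analyte_C": 1.25}
features = pair_meta_features(sample)

# A uniform offset applied to every analyte cancels out in the differences.
shifted = {name: value + 1.0 for name, value in sample.items()}
shifted_features = pair_meta_features(shifted)

for pair, value in features.items():
    print(pair, value, shifted_features[pair])
```

One property of difference-based pairs, visible in the output, is that a shift affecting every analyte in a sample equally leaves the pair features unchanged. Whether that property explains the accuracy gains reported in the paper is not something the article states.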
The researchers performed their initial prioritization of the ADNI markers using the combinatorial optimization-based (alpha,beta)-K-Feature Set method that Moscato and his team first introduced in 2003. He suggested that, while they have not yet been widely adopted by biomarker researchers, such combinatorial optimization techniques – which focus on selecting optimal objects from finite sets – could prove a useful approach to the analysis of such datasets.
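To give a flavor of what selecting optimal objects from a finite set can look like in this setting, here is a small brute-force Python sketch. It assumes a common statement of the (alpha,beta)-k-feature set problem: find the smallest set of features such that every pair of samples from different classes differs on at least alpha selected features, and every pair from the same class agrees on at least beta of them. The binarized data are invented, and the exhaustive search is a stand-in for illustration, not the solver Moscato's team used.

```python
from itertools import combinations


def smallest_feature_set(samples, labels, alpha=1, beta=0):
    """Exhaustively search for the smallest qualifying feature subset.

    samples: list of equal-length tuples of binarized feature values.
    labels:  class label per sample.
    Returns a tuple of feature indices, or None if no subset qualifies.
    Runtime is exponential in the number of features.
    """
    n_features = len(samples[0])
    sample_pairs = list(combinations(range(len(samples)), 2))
    for k in range(1, n_features + 1):
        for subset in combinations(range(n_features), k):
            ok = True
            for i, j in sample_pairs:
                differing = sum(samples[i][f] != samples[j][f] for f in subset)
                if labels[i] != labels[j]:
                    if differing < alpha:  # classes not separated enough
                        ok = False
                        break
                elif k - differing < beta:  # same class, too little agreement
                    ok = False
                    break
            if ok:
                return subset
    return None


# Invented binarized measurements: two controls (0) and two cases (1).
samples = [(0, 0, 1, 0), (0, 1, 1, 0), (1, 0, 0, 0), (1, 1, 0, 1)]
labels = [0, 0, 1, 1]

print(smallest_feature_set(samples, labels))           # alpha=1
print(smallest_feature_set(samples, labels, alpha=2))  # stricter separation
```

Raising alpha demands redundancy: with alpha=2 a single discriminating feature is no longer enough, so the search has to return a larger panel.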
"Biomarker discovery — and particularly the area of multivariate signatures, both in terms of gene expression and proteomics — is absolutely suitable for combinatorial optimization," he said. "I would even say that it is the right tool for the job."
"With [conventional] statistics you have a hypothesis and you try to prove this hypothesis correct: 'Is this single biomarker related to a disease or not?'" Moscato noted. With multivariate analysis, on the other hand, "it becomes a problem of doing a search on many combinations between biomarkers, and that is one of the key ingredients of our work [using combinatorial optimization.]"
Moscato and his colleagues have applied this technique to the analysis of Alzheimer's data beyond the ADNI set as well, most notably the Alzheimer's plasma proteome data published by Stanford University researcher Tony Wyss-Coray in a 2007 Nature Medicine study.
That study identified a panel of 18 proteins that the Stanford researchers said could distinguish between Alzheimer's patients and controls with roughly 90 percent accuracy as well as identify MCI patients who progressed to Alzheimer's.
In a re-analysis of the Stanford data published in PLoS One in September 2008, however, Moscato and his colleagues found that using their (alpha,beta)-K-Feature Set method they were able to identify a five-protein panel that surpassed the performance of the original 18-protein signature.
They followed this analysis in 2011 with another paper in PLoS One in which they employed a meta-feature-based approach using protein pairs, similar to the one used in their analysis of the ADNI dataset.
Since the initial publication of the Stanford Nature Medicine paper, questions have been raised about the work's reproducibility, with Wyss-Coray himself telling ProteoMonitor that he and his collaborators had been "probably a bit naïve and too enthusiastic" about the markers they identified.
Moscato suggested, though, that some of the trouble researchers have had reproducing the Stanford results could be due to improper statistical analyses rather than problems with the underlying data.
In particular, he noted, the inappropriate use of z-scores – a measure of how much a data point deviates from a dataset's mean – "plagues the [biomarker] field."
"People do not realize that sometimes the results [they are reanalyzing] have been produced using a z-score, and when you use a z-score that ties you completely not only to the particular technology that has generated the data, but also to the fact that the z-score [contains] information about all the other proteins in the panel," Moscato said.
He cited in particular a recent study by a Lund University team in which they tried, and failed, to duplicate the Stanford results (PM 1/30/2012). While the Stanford researchers had started with an analysis of 120 proteins from which they eventually selected a panel of 18, the Lund researchers in their study measured only those 18 proteins, which, Moscato said, makes a comparison of the two studies' results problematic.
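The article does not specify how the z-scores in question were computed. As one way to picture Moscato's point, the Python sketch below assumes a z-score calculated across all the proteins measured in a single sample, and shows that the same raw value gets a different score once some proteins are dropped from the panel. The protein names and values are invented.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Z-score each protein against the mean and SD of the whole panel.

    Assumption for this sketch: normalization is done within one sample,
    across every protein that was measured.
    """
    data = list(values.values())
    mu = mean(data)
    sigma = pstdev(data)
    if sigma == 0:
        raise ValueError("all values are identical; z-score is undefined")
    return {name: (v - mu) / sigma for name, v in values.items()}


# Invented raw measurements for one sample.
full_panel = {"p1": 1.0, "p2": 2.0, "p3": 3.0, "p4": 10.0}
# The same sample, with only a subset of the proteins measured.
sub_panel = {name: full_panel[name] for name in ("p1", "p2", "p3")}

z_full = z_scores(full_panel)
z_sub = z_scores(sub_panel)

print("p3 in full panel:", round(z_full["p3"], 3))
print("p3 in sub-panel: ", round(z_sub["p3"], 3))
```

In this toy example p3 sits below the panel average when p4 is included and above it when p4 is left out, even though its raw value never changes.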
To reduce this problem, better access to researchers' raw data is key, he said. "It is absolutely essential to have open access to [raw] datasets. I know this is difficult, but it is the only way."