NEW YORK (GenomeWeb) – Using Swath mass spec, a team led by researchers from the Swiss Federal Institute of Technology has completed one of the largest twin studies of human plasma proteins to date.
Detailed in a paper published last week in Molecular Systems Biology, the study provides information on sources of variability in plasma protein expression, both within individuals and across populations, and holds potentially valuable insights for clinical biomarker development, said ETH Zurich researcher Ruedi Aebersold, senior author on the paper.
Using samples from the Twins UK Adult Twin Registry, the researchers examined blood draws taken at two time points roughly four to seven years apart from 44 dizygotic and 72 monozygotic twins. Via Swath-MS, they quantified 342 plasma proteins in these 232 plasma samples, analyzing variability in their expression levels and the underlying causes of this variability.
This analysis found significant amounts of variability across many of the proteins measured. In total, 74 of the 342 proteins, or roughly 22 percent, demonstrated more than a 10-fold change across the entire cohort.
This high level of variability across the cohort was somewhat surprising, Aebersold told GenomeWeb. Perhaps more interesting from a biomarker discovery perspective, he noted, was the finding that the proteins differed significantly in their level of variability.
As the authors wrote, the standard deviation of a protein fold change from its average level ranged from lows of 0.1403 for antithrombin III and 0.1465 for vitamin D-binding protein to highs of 1.1936 for apolipoprotein(a) and 1.6871 for serum amyloid A-1 protein.
"So some proteins are not controlled as much, while there are some [others] that are apparently controlled quite tightly, probably for biological reasons," Aebersold said. "We think this has implications for biomarker research, because if you focus on a relatively small cohort and you inadvertently run into highly variable proteins, there is potential that you will misclassify [subjects] just by chance."
In terms of study design, this means researchers need significantly larger cohorts to demonstrate the utility of highly variable proteins as biomarkers than they would for more tightly controlled proteins, he added.
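To make the cohort-size point concrete, here is a minimal sketch using the textbook two-sample normal-approximation formula for sample size. It is not taken from the paper. The formula choice, the hypothetical effect size `delta`, and the treatment of the reported standard deviations as spreads on the same scale as `delta` are all assumptions made for illustration.

```python
import math
from statistics import NormalDist


def samples_per_group(sd, delta, alpha=0.05, power=0.8):
    """Approximate subjects per group needed to detect a mean shift `delta`
    with a two-sided, two-sample z-test (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_beta) * sd / delta) ** 2)


# Standard deviations are the ones quoted in the article;
# delta = 0.5 is a hypothetical effect size, not a figure from the study.
tight = samples_per_group(sd=0.1403, delta=0.5)     # antithrombin III
variable = samples_per_group(sd=1.6871, delta=0.5)  # serum amyloid A-1
print(tight, variable)
```

Because the required sample size grows with the square of the standard deviation, a protein that is about 12 times more variable needs on the order of 140 times more subjects to detect the same shift under these assumptions.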
Researchers have been aware of this issue of protein variability, Aebersold said, citing previous work by Arizona State University researcher Randall Nelson that used antibodies to follow the variability of a smaller set of proteins in a cohort. Nonetheless, he said, the issue remains "in the back of [researchers'] minds, if at all."
"I have done a lot of biomarker work, and this has never been in the forefront," he said. "We measure differences between cohorts of healthy and controls but have never really had a handle on the variability [within the cohorts]. So for us this is new and relatively new for the field."
Significantly, because Aebersold and his colleagues used the twin cohorts for the study, they were able to assess not only variability in protein expression, but also the underlying sources of that variability.
Monozygotic twins are genetically identical and therefore should exhibit essentially no hereditary variability within a pair. Dizygotic twins, on the other hand, "share roughly half of the identical by descent genetic variability," the authors noted. By comparing the monozygotic and dizygotic protein data, the researchers were able to determine the various underlying sources of protein variability.
Use of a DIA method like Swath was key to this comparison in that it allowed the researchers to reproducibly quantify a relatively large number of proteins across a large sample cohort, Aebersold said. The alternative would have been using a targeted mass spec approach like multiple-reaction monitoring, which, he noted, would have been considerably more time consuming and challenging from an assay development perspective.
Roughly 50 percent of variance across all the proteins measured was due to experimental design factors such as the effect of short-term protein concentrations and subject diet. The remaining 50 percent – the biologically stable portion – consisted of 13.6 percent heritability, 10.8 percent common environmental effects, 11.6 percent individual environmental effects, and 13.6 percent longitudinal effects.
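As a quick arithmetic check on the figures above, the four stable components can be summed and re-expressed as shares of the stable portion. The numbers are the ones reported in the article; the variable names are illustrative only.

```python
# Average variance components as reported in the article (percent of total).
stable = {
    "heritability": 13.6,
    "common environment": 10.8,
    "individual environment": 11.6,
    "longitudinal": 13.6,
}

stable_total = round(sum(stable.values()), 1)
unexplained = round(100 - stable_total, 1)

# Share of the biologically stable portion attributable to each source.
shares = {name: round(100 * value / stable_total, 1) for name, value in stable.items()}

print(stable_total, unexplained)
print(shares)
```

The components add up to 49.6 percent, consistent with the "roughly 50 percent" split described, and no single source dominates the stable portion.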
Individual proteins differed considerably in terms of the factors underlying their variance. For instance, the levels of 60 proteins were tightly linked to longitudinal changes – variability in expression in an individual over time. Another 52 were closely associated with familial environment and another 47 with individual environment, while for 80 heritability was statistically significant.
Using Gene Ontology and pathway enrichment analysis, the researchers looked at which biological functions appeared most regulated by which forms of control. They found, for instance, that blood coagulation was highly heritable, that functions like lipid metabolism were both highly heritable and tightly linked to individual environmental effects, and that hormone response was most closely associated with longitudinal effects.
The researchers also looked at the variability of 42 proteins approved by the US Food and Drug Administration for clinical use, finding that, in general, these had lower variability than the other proteins measured. Aebersold noted that this made sense, as the biomarker discovery process would, to an extent, work to weed out proteins that are highly variable.
He added, however, that with a better understanding of the sources of variability, some proteins eliminated during discovery experiments might be rescued and, ultimately, prove useful.
"Proteins might be at some level useful if, for instance, they are very variable within a population but very stable within an individual," he said. "So I think that if one understands the variability one could also rescue some proteins that are thrown out in a cohort."
Also interesting, Aebersold noted, was the fact that the genetic contribution to variability diminished between the two time points at which the subjects were tested.
"It's possible that genetic control diminishes because we, over time, accumulate genetic defects like mutations and they will obviously have some form of impact on the cells," he said. "So I'm not really surprised, but it's nice to see this in numbers."
It could also have implications for biomarker studies in that "you would need to factor in whether you are doing this in an older population or a younger population," he said.
Aebersold noted that while twin samples like those used in the MSB paper are quite precious, the variability data they enabled "would be useful to factor into" biomarker work – particularly, if, for instance, "one had a really promising marker."
Leigh Anderson, chairman and CSO of targeted proteomics firm SISCAPA Assay Technologies, told GenomeWeb that while little considered by the biomarker research community, understanding the sources of variation in protein biomarkers "is extremely important for a range of reasons."
For instance, "if you are measuring the amount of a protein that is purely genetically determined, then you are looking at a genomic biomarker rather than a proteomic one," he said. "Whereas, if you are looking at something with no genetic regulation or very little, then you are looking at something that is really just pure phenotype, and knowing the difference between those I think is really important."
Anderson, who was not involved in the MSB research, said that he thought it "makes a huge amount of sense to measure the genetic control of [candidate] biomarkers before you really go deep into the selection of markers."
"That tells you a lot about what the value of a biomarker [is] and, particularly, how useful it's going to be to measure it longitudinally," he added.
Anderson noted as well that the study's information on longitudinal variability highlighted an issue that the field is just beginning to grapple with.
"It's known that there are a number of proteins, like IGF1, that vary systematically by large amounts [over a person's lifetime]," he said. "This hasn't been very well studied, but it is of concern to clinical lab people because they have to set normal ranges, and in a case where the normal range varies with relation to age or sex, it just complicates their life a lot."