Genomic-Proteomic Mouse Study Offers New Insights Into Regulation of Protein Expression


NEW YORK (GenomeWeb) – Combining in-depth proteomic characterizations with a highly genetically variable mouse model, researchers at The Jackson Laboratory and Harvard Medical School have gained new insight into the links between genes, transcripts, and proteins and mechanisms regulating their expression.

In a study published this week in Nature, the researchers presented data suggesting that a given protein's abundance is significantly regulated by the expression levels of other proteins with which it forms stable protein complexes. This so-called "stoichiometric buffering" could help explain the long-observed discordance between transcript and protein levels and provide a new understanding of various protein complexes and biological pathways, Steven Munger, a Jackson Labs researcher and study author, told GenomeWeb.

The researchers looked at protein and transcript levels in liver tissue from 192 diversity outbred (DO) mice, a highly genetically diverse mouse model Jackson Labs has developed.

The mice "are like a reservoir of natural genetic variation," Munger said, adding that with roughly 50 million SNPs and 2 million insertions and deletions, the model has more genetic variation than the human population.

"For almost every protein-coding gene, there are one or more natural variants that act to perturb the expression of those genes," he said, which makes the model well-suited to mapping quantitative trait loci.

In collaboration with Harvard Medical School researcher Steven Gygi, who handled the proteomics analysis, Munger and his co-authors generated proteomic and transcriptomic profiles of the 192 mice, measuring 6,756 proteins and 16,921 transcripts. Focusing their analysis on the 6,707 proteins that, along with their corresponding transcripts, were detected in at least half of the liver samples, the researchers used transcript and protein QTL mapping to unravel the genetics underlying the observed differences in mRNA and protein expression.

Looking at the proteomic and transcriptomic data, the researchers identified what Munger described as "two very distinct classes of proteins." The first were proteins that were affected by local genetic variations that were close to the actual protein-coding gene.

In the case of those proteins, it appeared that their abundance was typically controlled via transcriptional mechanisms. "So you think of the central dogma where DNA affects RNA which affects proteins," Munger said. "These were cases where the genetic variation was affecting transcript levels first and then that was being translated into differences in protein abundance."

In all, he said, around 80 percent of the proteins the researchers looked at were regulated transcriptionally.

This, Munger noted, was an interesting finding in and of itself, given that a number of previous studies have found low correlation between RNA and protein levels.

"This was kind of in contrast with the last fifteen or so years, there has been lots of controversy over reported low correlation between the protein and transcript abundance," he said. "But it turns out that if you have genetic variation in almost all of your genes [as is the case with the DO mouse model], you are going to find a large group of proteins where the transcript variation matches the variation in protein abundance."

Perhaps more interesting, though, were the cases where variations in protein abundance were regulated not by a nearby genetic variation but by a distant one, such as a variant on a different chromosome.

The researchers found that nearly all of these proteins were regulated at the post-transcriptional level, Munger said. "This was completely unexpected, because you would think if there was genetic variation somewhere in the genome that was affecting a protein's abundance it might be variation in a transcription factor or something where that variation would affect the target transcript abundance before affecting the protein abundance. But it looks like that regulation happens at a post-transcriptional level almost completely."

This led Munger and his colleagues to hypothesize that the distant variant was acting on another transcript which altered the expression level of its corresponding protein, which then affected the expression of the target protein being investigated.

"The abundance of that [transcriptionally regulated] protein is actually able to confer the effect on our target protein so we call that [initial protein] a mediator," he said.

Such mediation is an indication that the two proteins are interactors of some sort, which, Munger said, raised the possibility of using the genetic variation information generated in their experiment to identify new protein-protein interaction and possibly infer biological networks.

To do this, the researchers used an approach called mediation analysis, which allowed them to identify proteins and transcripts that were regulated by local genetic variants and that then regulated one or more other proteins at a post-transcriptional level.
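The logic of mediation analysis can be illustrated with a minimal sketch on simulated data. This is not the study's pipeline: the three-level genotype coding, the effect sizes, and the function names below are all assumptions made for illustration. The idea is that if a candidate protein truly mediates a distant variant's effect on a target protein, the association between variant and target should largely disappear once the candidate's abundance is accounted for.

```python
import random
import statistics


def correlation(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def residuals(y, x):
    """What is left of y after removing its least-squares fit on x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return [b - my - slope * (a - mx) for a, b in zip(x, y)]


def association_drop(genotype, target, candidate):
    """Return (|r| before, |r| after conditioning on the candidate)."""
    before = abs(correlation(genotype, target))
    after = abs(correlation(residuals(genotype, candidate),
                            residuals(target, candidate)))
    return before, after


# Simulated data (hypothetical): the variant changes the mediator's
# abundance, and the mediator's abundance sets the target's abundance.
rng = random.Random(0)
n_mice = 192  # cohort size taken from the article
genotype = [rng.choice([0, 1, 2]) for _ in range(n_mice)]  # simplified coding
mediator = [g + rng.gauss(0, 0.5) for g in genotype]
target = [m + rng.gauss(0, 0.5) for m in mediator]
unrelated = [rng.gauss(0, 1) for _ in range(n_mice)]  # control protein

before, after_mediator = association_drop(genotype, target, mediator)
_, after_unrelated = association_drop(genotype, target, unrelated)

print(f"variant-target association:        {before:.2f}")
print(f"after conditioning on mediator:    {after_mediator:.2f}")
print(f"after conditioning on a control:   {after_unrelated:.2f}")
```

In this toy setup, conditioning on the true mediator should remove most of the variant's apparent effect on the target, while conditioning on an unrelated protein should leave it largely intact. A real analysis would scan many candidate mediators and would need to model the multiple founder haplotypes of the DO mice rather than a single 0/1/2 genotype.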

By doing that, they were able to identify both known and new binding partners and networks of genes, Munger said. They also identified the aforementioned "stoichiometric buffering" phenomenon, wherein it appeared that transcriptional regulation of one protein in a complex resulted in post-transcriptional regulation of the other proteins in that complex.

He cited as an example the chaperonin-containing TCP1 (CCT) complex.

"We found that all the proteins in that complex shared a variant on chromosome 5 that affected their protein abundance but not their transcript abundance," he said. One of the CCT proteins, though, was coded by a gene on chromosome 5 and was affected transcriptionally by a nearby mutation. And in mice with that mutation, that protein had lower transcript and protein abundance, which in turn lowered the abundance, at the protein but not the transcript level, of the other proteins in the CCT complex.

Essentially, the transcriptional regulation of that protein on chromosome 5 "kind of set the minimum amount of the other proteins that could be bound in a stable complex," Munger said. "Not all the examples follow this, but we are pretty confident that protein stoichiometry plays a huge part in regulating steady-state protein abundance and also buffering protein abundance against genetic variation that could affect transcripts."
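As a rough illustration of the buffering idea Munger describes, the toy model below assumes a complex with 1:1 stoichiometry in which any subunit not bound into the complex is cleared. The subunit names, the abundance numbers, and the `unbound_survival` parameter are hypothetical and are not drawn from the study.

```python
def stable_abundance(synthesized, unbound_survival=0.0):
    """Steady-state protein level of each subunit in a 1:1 complex.

    synthesized: amount of each subunit produced, taken here to track
        its transcript level (a simplifying assumption).
    unbound_survival: fraction of excess, unbound subunit that escapes
        clearance (0.0 means all unbound protein is degraded).
    """
    # The scarcest subunit caps how many complete complexes can form.
    limit = min(synthesized.values())
    return {
        name: limit + unbound_survival * (amount - limit)
        for name, amount in synthesized.items()
    }


# Hypothetical subunits of one complex in two mice. The variant mouse
# carries a local variant that lowers transcription of subunit_a only.
reference = {"subunit_a": 100, "subunit_b": 120, "subunit_c": 110}
variant = dict(reference, subunit_a=60)

reference_protein = stable_abundance(reference)
variant_protein = stable_abundance(variant)

for name in reference:
    print(name,
          "synthesis:", reference[name], "->", variant[name],
          "| protein:", reference_protein[name], "->", variant_protein[name])
```

Under these assumptions, lowering the synthesis of one subunit lowers the protein abundance of every subunit while leaving the synthesis of the others unchanged, which mirrors the pattern reported for the CCT complex. The same model also shows the buffering side: extra transcript for a non-limiting subunit does not raise its stable protein level.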

Another example of this effect, he said, was the dystrophin complex associated with muscular dystrophy.

"There are a number of proteins in that complex, and mutations that affect any one of those will cause varying degrees of dystrophy-like phenotypes," he said. "And [researchers] have also seen that if one of the proteins in that complex is mutated, the levels of some of the other proteins in that complex are also lower. They weren't sure why, and this is exactly why."

"When you have proteins that are bound in a stable complex, the evolutionary pressure seems to be on getting those proteins bound stably or else they are cleared quickly," he added. "I think you just don't want aggregation."

Munger said he and his colleagues had looked at an additional tissue, were in the process of looking at a third, and had thus far found the same pattern to hold. They also validated the findings in a different population of mice.

Munger said that in addition to validating the stoichiometric buffering concept, he and his colleagues are also working to use it to manipulate specific biological pathways with the aim of developing new mouse models.

"I am seeing if I can manipulate other genes in a complex and then take over the regulation of that complex," he said. "I am also looking at specific liver pathways and seeing if I can predict combinations of crosses between inbred mouse strains that will produce a pathway that is either sensitized or particularly robust. So we can start to use these principles to help us make better animal models."

Munger suggested that the study highlights the importance of studying systems on the proteomic level.

"If you take the set of nearly 7,000 proteins where we have both protein and transcript abundance, what we find is that for a large number of those proteins, genetic variation can be identified that affects transcript abundance but doesn't seem to affect protein abundance at all," he said.

"A lot of times in human GWAS studies when they try to go find the genes in those regions that are associated with disease, one of the things they first look for are variants that affect transcript abundance," he said. "But what we are seeing in our population is that up to half of those may not affect protein abundance, and so to get a really good read out of how that genetic variation is affecting the proteome, you have to have the protein measurements."