A Fix for Expression Analyses


Researchers conducting global gene expression studies may be assuming too much when it comes to interpreting their data, according to Whitehead Institute scientists. In a paper published in October in Cell, the researchers question the assumption that all cells produce similar levels of messenger RNA — an assumption they say has likely led to erroneous interpretations about the relative regulation of genes in different cell types. In their paper, they use three different gene expression analysis methods to demonstrate that such a problem exists, and propose a standardization method using synthetically produced RNA "spike-ins" to produce better assessments of changes in steady-state levels of RNA. Ben Butkus recently spoke with Whitehead's Tony Lee about the findings. What follows is an excerpt of their conversation.

Genome Technology: Your main finding is that there is cell-to-cell variation in the amount of mRNA produced, correct?

Tony Lee: Yes, more or less. In this particular experiment we're doing this in [different] cell types or cell conditions, so we're not looking at single-cell resolution.

GT: Had this potential variation in mRNA production by cell type previously been hypothesized?

TL: It's something that is relatively well known. For instance, when expression arrays first came out, a lot of their users studied what happens to transcription genome-wide when you knock out certain factors that are important for transcription. At that time, there was recognition that if you knocked down something that was important for general transcription, you were going to affect all of transcription.

We … formally showed this effect across a number of different gene expression platforms; and more importantly tried to show one way to … tackle this potential problem in the future.

GT: What gene expression analysis technologies did you use to interrogate this variation?

TL: So far we've used DNA microarrays from Affymetrix; RNA sequencing technology [from Illumina]; and NanoString's [digital counting] technology. The important part of the paper wasn't really testing the platforms against one another. We were testing whether all three … end up with the same bottom line, which is that with the standard normalization you get expression indicating that many genes are unchanging, a few change up, and a few change down; and when you revisit normalization you get a very different impression. The effect of the normalization was constant across all three platforms.
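The contrast Lee describes can be illustrated with a small sketch. This is not the authors' pipeline. The gene names, spike-in names, signal values, and helper functions below are all hypothetical, and the sketch assumes the spike-in RNA was added in the same amount per cell in both samples. It compares scaling each sample to its total signal (the standard approach) with scaling to the spike-in signal.

```python
from math import log2

def scale_to_total(counts, genes):
    """Standard normalization: each gene as a fraction of the sample's total gene signal."""
    total = sum(counts[g] for g in genes)
    return {g: counts[g] / total for g in genes}

def scale_to_spike_ins(counts, genes, spikes):
    """Spike-in normalization: each gene relative to the sample's spike-in signal."""
    spike_total = sum(counts[s] for s in spikes)
    return {g: counts[g] / spike_total for g in genes}

def log2_fold_changes(reference, sample):
    """Per-gene log2 ratio of sample over reference."""
    return {g: log2(sample[g] / reference[g]) for g in reference}

# Hypothetical names, used only for illustration.
genes = ["gene_a", "gene_b", "gene_c"]
spikes = ["spike_1", "spike_2"]

# Made-up raw signal for a control sample.
control = {"gene_a": 100, "gene_b": 400, "gene_c": 500,
           "spike_1": 50, "spike_2": 50}

# Made-up raw signal for a sample whose cells produce twice as much of
# every mRNA. The platform yields the same total signal per sample, so the
# gene values look unchanged and the spike-in signal is halved instead.
amplified = {"gene_a": 100, "gene_b": 400, "gene_c": 500,
             "spike_1": 25, "spike_2": 25}

standard_lfc = log2_fold_changes(
    scale_to_total(control, genes),
    scale_to_total(amplified, genes),
)
spike_lfc = log2_fold_changes(
    scale_to_spike_ins(control, genes, spikes),
    scale_to_spike_ins(amplified, genes, spikes),
)

for g in genes:
    print(f"{g}: standard log2FC = {standard_lfc[g]:+.2f}, "
          f"spike-in log2FC = {spike_lfc[g]:+.2f}")
```

With these made-up numbers, total-signal scaling reports a log2 fold change of 0 for every gene, while spike-in scaling reports 1 for every gene, that is, a twofold increase per cell. Real data would be noisier, and the sketch only holds if the spike-ins really were added in proportion to cell number.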

GT: What are the implications of this for prior and future gene expression analysis research?

TL: The bottom line is that it's difficult to anticipate when this kind of phenomenon might be happening. For [future] gene expression experiments, we think it's probably a good idea to use this type of standardized control.

GT: Do your results imply that all previous gene expression analysis studies need to be looked at again to make sure this didn't happen?

TL: We haven't really come up with a way to salvage, so to speak, old data. You could probably reconstruct it if you had tracked the cell numbers and the total amount of RNA you were getting. That's a little bit complicated because most people don't track total RNA. In addition, it's actually changes in total mRNA production that are the most problematic, because that's what you're measuring with gene expression analysis. We actually have seen situations [in which] the total RNA doesn't seem to change very much, but the total mRNA is changing quite a lot. Normalization back to the cell number is actually the important factor.

We don't want to suggest this is a situation where everything that has been done is wrong. It's just another thing where researchers should be aware of the assumptions that went into their experiments.
