NEW YORK (GenomeWeb) – An international consortium of researchers has found evidence that the commonly used HeLa cancer cell line may be more heterogeneous than previously thought.
In a study published last month as a BioRxiv preprint, the researchers found that HeLa cell populations across laboratories varied significantly at the genomic, transcriptomic, proteomic, and phenotypic levels. This variation occurred not only between cell line varieties known to be distinct, but also between stocks presumed to be homogeneous and even within single stocks as they were sub-cultured over the course of an experiment.
As HeLa cells are one of the most common model systems in biological research, used or referenced in nearly 100,000 publications in the PubMed repository, these findings have significant implications for the research community, said Ruedi Aebersold, a professor at the Swiss Federal Institute of Technology, Zurich and the senior author of the study.
Aebersold said he and his colleagues decided to investigate the heterogeneity of HeLa cells as part of a larger interest in questions around reproducibility of life science studies. As the authors noted, reproducibility has emerged as a serious issue in biological research, highlighted by a pair of studies in 2011 and 2012 in which pharma researchers found they were unable to replicate 80 to 90 percent of the results of major life science research papers.
"Basically, we wanted to generate an argument to [help] say why things don't turn out identically," Aebersold said. "Perhaps the underlying reason is not necessarily because they are incompetent or cheap, but because the biology is much more complicated [than presumed]."
That HeLa cell populations are heterogeneous is a real problem, said Wolfgang Huber, a group leader in multiomics and statistical computing at the European Molecular Biology Laboratory. The BioRxiv preprint is "probably the best quantitative study of the phenomenon so far," he said.
Huber, who was not involved in the research, has published work examining the genomic and transcriptomic landscape of HeLa cells.
Aebersold said it has long been known that, since researchers began using HeLa cells in the 1950s, the line has diverged into at least two distinct populations, the CCL2 and Kyoto varieties.
"It's totally clear [from this divergence] that these cells eventually drift apart," he said, adding that despite this knowledge, researchers do not commonly report which variety of HeLa they worked with when publishing their results.
This observed divergence suggested that there might be even more widespread heterogeneity within HeLa cells than is generally assumed. To test this notion, the researchers collected 14 aliquots of HeLa cells from 13 labs around the world. They then cultured them in uniform conditions and measured their genome-wide copy numbers, transcriptomes, proteomes, and protein turnover rates. They used array-CGH to measure gene copy number variation, mRNA sequencing for transcriptomic profiling, and Swath mass spec on a Sciex 5600+ TripleTOF instrument for proteomic profiling. In addition, they used pulsed SILAC (pSILAC) labeling, which measures the rate at which isotopically labeled amino acids are incorporated into proteins, to assess protein turnover rates.
The researchers also investigated the response of each strain to Salmonella infection to look at the phenotypic effects of the molecular heterogeneity.
The authors observed what they characterized as "a considerable degree of large-scale [copy number variation] across HeLa cells cultured in different labs," even, they noted, "among strains with the same annotation." These genomic differences translated to transcriptomic and proteomic variability that was "comparable to the variability observed between cells of different tissue origin," they wrote. These molecular differences further translated into phenotypic differences, including in growth rate and in resistance to Salmonella infection.
The differences were most pronounced between the CCL2 and Kyoto varieties, which led the authors to "strongly suggest that all future HeLa-related studies should at least clearly report the identities of CCL2 or Kyoto for the cells used."
However, even within the same cell population, the researchers observed variation over time. In the case of CCL2 cells, after three months of culturing, they found "copy deletions or gains of some whole chromosomes, 6 to 7 percent of gene differential expressions, and quantitative changes of pathways."
This, they noted, meant that even over the course of a moderately long experiment, researchers can expect to see significant molecular changes in the cell population they are studying, which could affect their results.
"It came out that these cells are not only different in different labs, but even within the same lab, they change," Aebersold said. "In a couple of months, they are not the same cells anymore."
He said the findings were not only relevant to HeLa cells but were likely generalizable to other cancer cell lines used as model systems for research.
Aebersold said he did not believe the findings meant researchers should stop using cell lines in their work, but he suggested that some sort of benchmarking and quality control is needed.
"What I think would be a good middle ground, or an actually achievable ground, would be that if someone published a paper with some cell lines, they then should include some form of molecular description [of those cells]," he said, suggesting that, for instance, researchers could include a transcriptomic or proteomic analysis of the cell lines they used in their work.
If two researchers came to different conclusions, those data could be used to "very easily go back and say, well you were basically working with different cells," Aebersold said. "They both go by the name of HeLa, but they are not the same cell."
He noted that the relative speed and ease of transcriptomic and proteomic profiling make such an approach more feasible than it might have been in the past. And while not all researchers will have this expertise in-house, most will have access to a core lab capable of performing such an analysis, he said.
While the findings are discouraging in one sense, Aebersold suggested they point toward an addressable problem potentially underlying the life sciences reproducibility crisis.
"If you identify the problem in some detail, then you can deal with it," he said. "So in that regard, I think it's actually a positive outcome, even though it complicates many matters for many people."
Furthermore, he said, were researchers to begin providing molecular characterizations of their cell lines as part of their publications, those data would be a highly valuable resource.
"Let's think that for all of the about 100,000 papers [published using HeLa cells], if each one of those papers had documented the transcriptome or the proteome of those cells and this was accessible [along with] the corresponding phenotypic or functional information that people had extracted from those cells," Aebersold said. "It would be fantastic resource."
"Of course, this wasn't possible over the decades, but from here forward, we are able to do it," he said. "If one did actually capture the molecular makeup [of these cells] and related it to the sometimes extremely elaborate biological insights that are being generated, I think it would be a great thing for biological knowledge."
Huber agreed that providing more extensive genomic or proteomic data on cell lines used in an experiment could be useful, for instance, for "debugging and determining what went wrong in a given study."
However, he said that the concern about cell line heterogeneity affecting the reproducibility of an experimental finding might be misdirected.
Rather than worry about heterogeneity within cell lines causing issues with reproducibility, Huber suggested researchers should worry that results affected by this sort of heterogeneity are not robust enough to be meaningful.
"I think the real solution is scaling up the number of cell lines used — not doing experiments on one or two or three, but on many," he said. "It’s the same with patient-derived tissues. Ideally, you don’t study just two or three samples, but dozens, hundreds."
High-throughput omics technologies make this more plausible than it might have been in the past, Huber said.
"We have the capability now to do experiments in higher throughput, and we are moving towards experimental designs where we use many different cell lines, so you can see that the effects that you observe are robust across many different clones," he said. "The important thing is not to converge on one super-specific and highly defined model system, but rather to make sure that the statements we are making based on these model systems are robust and largely independent of such variation."