NEW YORK – Two research teams have compared a multitude of single-cell RNA sequencing approaches and set new benchmarks for researchers who use scRNA-seq in their studies.
The comparison projects, published last week in Nature Biotechnology, emerged from the ongoing Human Cell Atlas project and could help scientists decide on the best approaches to use in a rapidly evolving field.
"It's kind of a moving target," said Holger Heyn, team leader of single-cell genomics at the Center for Genomic Regulation in Barcelona and corresponding author on one of the papers, "Benchmarking single-cell RNA sequencing protocols for cell atlas projects." "There are new techniques coming out every other month, and what we tested in these studies were the most common techniques out there," he said.
"We tried to move as quickly as we could, we realized there was a window where our experiments would be more relevant," said Joshua Levin, a senior group leader and research scientist at the Broad Institute and the corresponding author on the second paper, "Systematic comparison of single-cell and single-nucleus RNA-sequencing methods."
Levin said the idea for such a benchmarking study dated back to discussions within the standards and technology working group of the HCA in 2017. Ultimately, two experimental designs were devised: a team led by researchers at the CRG in Barcelona decided to evaluate 13 protocols across several centers, while a Broad-led team of investigators decided to look at seven methods for scRNA-seq within its own laboratory.
There was, of course, overlap between the approaches evaluated in each study, and preliminary versions of both papers appeared on bioRxiv last year. Both Heyn and Levin said they added supplementary figures and data to the final papers, though their main conclusions did not change.
The researchers at CRG cast a wide net when they decided on their study design by not only looking at 13 different protocols but also involving their developers. The resulting paper therefore has nearly 40 authors with 25 different affiliations, among them the Broad's Levin.
"We tried to involve everyone who is expert in their own technologies to ensure that each technique was run in the correct way," said Heyn. By running the study across multiple sites, the researchers also felt they represented the way data is actually generated within the HCA project.
"We had a multicenter study design, which is more in line with what is being done at the Human Cell Atlas project where many different datasets are produced at different sites with different technologies," noted Heyn.
The 13 protocols evaluated in the CRG-led effort included two widely used, low-throughput plate-based methods: Smart-seq2, a method developed at the Karolinska Institutet in Stockholm, and CEL-Seq2, which was developed by researchers at the Israel Institute of Technology. They also included Drop-Seq and inDrop, droplet-based methods that were both developed at Harvard. The latter has been commercialized by Watertown, Massachusetts-based 1CellBio.
Among the higher-throughput methods, they evaluated 10x Genomics' Chromium for single-cell and single-nucleus sequencing; Quartz-seq2, a high-throughput method developed at Japan's Riken research institute; MARS-Seq2, developed at the Weizmann Institute of Science in Israel; gmcSCRB-seq, created at the University of Munich; Bio-Rad's ddSEQ droplet-based microfluidic system; Takara Bio's ICELL8 system; as well as Fluidigm's C1 for small- and medium-cell RNA expression. The number of methods underscores the competitiveness of the field. Both 10x and 1CellBio, for instance, have been involved in a lawsuit related to the technology since 2018.
To benchmark these different protocols, the CRG-led team designed a reference sample consisting of human peripheral blood mononuclear cells and mouse colon. To mitigate any variability due to library preparation, a single batch was generated and then distributed to those taking part in the study. In total, they profiled about 3,000 cells with each protocol.
Each of the methods was scored according to a number of performance metrics, including gene detection and marker expression, as well as data integration aspects such as clusterability, mappability, integrated clusterability, and mixability.
According to the authors, there were "marked differences" among the protocols evaluated depending on the metric, though Quartz-seq2, Chromium, Smart-seq2, and CEL-Seq2 topped the overall benchmarking score list. In general, they found that plate-based methods, such as Quartz-seq2, CEL-Seq2, and Smart-seq2, reliably generated high-resolution transcriptome profiles. They also said microfluidics systems, such as Chromium, "showed excellent performance" in the comparison.
In some metrics, though, such as gene detection, droplet-based methods such as Drop-seq and inDrop fell lower on the list, as did ICELL8, MARS-Seq2, and gmcSCRB-seq. The last three had the lowest overall benchmarking scores.
"Obviously, Riken is very happy that Quartz-seq performed best, and 10x's Chromium was not so bad either, but just because the last technique performed last in our comparison, it does not mean it is a bad technique," noted Heyn. However, he said the information provided in the benchmarking study should be of use to researchers as they shop around for the best methods.
"I think it helps in the decision-making process toward which techniques could be implemented in the lab now," said Heyn. "It's a benchmarking scoring game, but there are many other considerations, too," he said. "We wanted to inform the single-cell community to make them aware of what the outcomes can be from each technique."
He noted that all the techniques evaluated in the paper are "quite accurate," and said that the findings should not reflect poorly on studies done to date with methods that scored lower than others. "What you are measuring is real," noted Heyn. "It's not technical artifacts. It's not a question of accuracy, it's a question of resolution."
The researchers also provided a nuanced comparison of costs. "Generally, late multiplexing methods, such as Smart-seq2, are more costly, but costs can be reduced by miniaturization and use of noncommercial enzymes," they wrote. "Custom droplet-based protocols have lower costs than their commercialized counterparts, but the optimized chemistry in commercial systems resulted in improved performance," they noted.
"It's very lab-dependent how protocols are set up," said Heyn. "Especially smaller volumes, run on more expensive robots, decrease the costs for plate-based methods."
Benchmarking at the Broad
For the Broad-led study, the researchers decided to also look at Smart-seq2, CEL-Seq2, 10x Chromium, Drop-Seq, and inDrop. As part of their study, they also assessed Seq-Well, a method developed at the Massachusetts Institute of Technology, and sci-RNA-seq, a method developed at the University of Washington that relies on a combinatorial indexing strategy.
To carry out the benchmarking, they analyzed three sample types: a mixture of human and mouse cell lines, PBMCs, and mouse cortex nuclei. They generated 36 different scRNA-seq libraries and produced expression profiles from about 92,000 cells in total.
Levin said the team designed the study so that it could be replicated at a later date. In particular, he said, the sample types used are widely available. "Those samples are the kinds of samples that if you have a new method, you can go ahead and collect data for the same sample type and compare it," said Levin. "We didn't take some obscure sample type and test it on that."
Because every method assessed has its own accompanying computational pipeline, the researchers involved in the study developed their own software package called Scumi, which enabled them to directly compare across methods. According to Levin, Scumi includes a feature that allows users to streamline the removal of low-quality cells before downstream analysis. "That's another contribution," he said. "It's probably more subtle, but it's also in the Scumi code."
As noted in the paper, the results from the Scumi-generated datasets were compared to the various computational pipelines available for each tool and in general found to be consistent.
Levin's team assessed the seven different approaches according to numerous metrics, including sensitivity, reproducibility, technical precision, and ability to capture information about cell types. Overall, the researchers reported in the paper, the 10x Chromium had the "strongest consistent performance" with some caveats, such as an inability to assign an identity to some cells in cortex nuclei or detect all cell types present. They also found the lower-throughput approaches, Smart-seq2 and CEL-Seq2, to be roughly equivalent in performance to each other.
The Broad-led team also examined cost as part of the study. "Low-throughput methods are more expensive than high-throughput methods, as expected," noted Levin. He added that these calculations change depending on the number of cells in the experiment, so that while they reported actual costs, they might differ in other experiments to some degree. Levin noted that multiplexing strategies will likely lower costs in many cases.
"I think all the methods generate useful data and they have relative strengths and weaknesses," said Levin. "I think that's important to keep in mind." Going forward, he said the study results could inform developers of whether it makes sense to continue to invest in their methods. The paper could also guide decisions on what approaches are best for certain kinds of studies.
"I have been involved in other benchmarking studies and usually what happens is, the field coalesces around a method and the other methods are used less," said Levin. He cautioned that the field is still "fast moving" and said "it's hard to know what will come" next in scRNA-seq.
Aviv Regev, a computational biologist at the Broad and co-author on both studies, said in an email that the studies were "critical" for the HCA, as they provided "guidelines and rich information across protocols applied to map the body at single-cell resolution." Moreover, she said they "reflect the open and collaborative spirit of HCA, allowing us to take on comprehensive and complementary studies."
According to Levin, his team has no plans to undertake more benchmarking studies at this time, though he said such efforts are worth doing. CRG's Heyn, however, said his team is currently at work benchmarking single-cell ATAC-seq methods. They also recently published a preprint on bioRxiv concerning sampling artifacts in single-cell genomics cohort studies.
Various vendors were contacted to comment on the outcome of the studies, but none responded in time for this article. However, Gary Schroth, vice president and distinguished scientist at Illumina, said in an email that the studies "highlight that the different platforms each have their own advantages and disadvantages." According to Schroth, the platform and application chosen going forward will depend upon research goals and sample type.