Diverse strategies for sequencing single cells were front and center at this year's annual Biology of Genomes meeting at Cold Spring Harbor Laboratory in New York, with four researchers presenting information on methods to sequence individual cells as part of cancer, genetic mapping, and other studies.
Building on single-nucleus sequencing work that he presented at the same conference two years ago as a member of Michael Wigler's lab at Cold Spring Harbor Laboratory (IS 5/18/2010), Nicholas Navin outlined the newest iteration of the method, called Cell-Seq, at this year's meeting.
The new method reduces the amount of amplification needed for each cell by using flow-sorting to specifically nab single cells from the G2/M stage of the cell cycle that have already doubled their DNA content by replication, explained Navin, now at the MD Anderson Cancer Center.
The post-replication single-cell sequencing method was designed to address coverage limitations of the single-nucleus sequencing approach, which generates sufficient coverage for analyzing copy number but not for looking at base pair-level mutations across the genome in individual tumor cells.
The Cell-Seq method also employs so-called "transposon tagmentation," a combination fragmentation and adaptor ligation step that is suitable for small amounts of input material.
To validate the approach, the team tested cells from a relatively stable breast cancer line called SK-BR-3 that appears to be monoclonal based on cytogenetic and copy number profiling, Navin said during his presentation.
Indeed, when they used the Illumina HiSeq 2000 to sequence two single cells from the line to more than 50-fold coverage each, over 84 percent and 88 percent of the genome, respectively, the team was able to see almost all of the single nucleotide mutations and small insertions and deletions that were present in sequences generated for an SK-BR-3 cell population sample.
For instance, investigators were able to find more than 3 million single nucleotide changes in the single cells compared to around 3.3 million in the population sample, including more than 96 percent of the non-synonymous mutations. Sequences for the single cells also revealed almost 93 percent of the indels detected in collections of SK-BR-3 cells.
They also found some apparent mutations in the single-cell data that did not fall out of the population sequence data, Navin said. Single-cell PCR and other follow-up studies suggested a significant proportion of these were authentic variants.
The team has since applied the Cell-Seq approach to investigate cancer mutation and evolution patterns in a breast cancer patient whose tumor showed a highly clonal profile by copy number profiling.
In contrast, researchers from CSHL have taken the single-nucleus sequencing method in a different direction, streamlining the number of reads per cell needed to do sequence-based copy number analyses on individual cells.
Though both groups are interested in cancer evolution, the distinct approaches currently being used by Navin's group and the CSHL researchers make it possible to approach the problem from different directions, according to Timour Baslan, a graduate student in James Hicks' CSHL lab.
"We diverged in the way that we asked questions: Nicholas [Navin] wanted to know every single base pair in the genome and what we wanted was to get a lot of single cells and apply population genetic theory to our data," Baslan told In Sequence.
For previous single-nucleus sequencing studies, the team was running one cell per lane on the Illumina GAII at a cost of around $1,500 to $2,000 per cell — still a steep figure for those looking to do population or other studies on large numbers of single cells or tests on multiple single cells in a clinical setting.
In an effort to increase the throughput of the approach, Baslan and his colleagues from James Hicks' CSHL lab ran simulation analyses on sequence data generated for a single-cell breast cancer sequencing study that Navin and co-authors published in Nature last year.
In these sub-sampling simulations, the team looked at the results they got using 10 million, 8 million, 4 million, 2 million, 1 million, 500,000, or 250,000 reads selected at random.
When they divided the genome into 50,000 bins, for example, they found that 2 million reads were required to do copy number analysis on a single cell.
That dropped to 250,000 reads per cell when they divided the genome into 5,000 rather than 50,000 bins — a level where it is possible to multiplex hundreds of barcoded single cells together on a single sequencing lane.
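The two read thresholds quoted above work out to a similar average number of reads per bin. The short Python sketch below is an illustration only, not the CSHL group's pipeline: it computes that average and shows one simple way to draw a random sub-sample of reads and count them per bin. The function names and the representation of each read as a pre-assigned bin index are assumptions made for the example.

```python
import random


def mean_reads_per_bin(n_reads, n_bins):
    """Average number of reads landing in each bin if reads were spread evenly."""
    return n_reads / n_bins


def subsample_bin_counts(read_bins, n_reads, n_bins, seed=0):
    """Randomly pick n_reads reads and count how many fall in each bin.

    read_bins is a hypothetical input: one bin index (0..n_bins-1) per read,
    assumed to have been assigned by an upstream alignment step.
    """
    rng = random.Random(seed)
    sample = rng.sample(read_bins, n_reads)  # sampling without replacement
    counts = [0] * n_bins
    for bin_index in sample:
        counts[bin_index] += 1
    return counts


if __name__ == "__main__":
    # Figures quoted in the article.
    print(mean_reads_per_bin(2_000_000, 50_000))  # 50,000 bins
    print(mean_reads_per_bin(250_000, 5_000))     # 5,000 bins

    # Toy data: 10,000 simulated reads spread over 100 bins, sub-sampled to 1,000.
    toy_rng = random.Random(1)
    toy_reads = [toy_rng.randrange(100) for _ in range(10_000)]
    toy_counts = subsample_bin_counts(toy_reads, 1_000, 100)
    print(sum(toy_counts))
```

By this arithmetic, both settings correspond to roughly 40 to 50 reads per bin on average, which is consistent with the idea that coarser bins need proportionally fewer reads per cell.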
"We investigated the data in a cautious manner and found out what the limited requirements are and then used molecular biology and barcoding to be able to apply high-throughput multiplexing," Baslan said.
He noted that the team has already used the approach to test thousands of cells as part of population and evolutionary studies of breast, colorectal, ovarian, and lung cancers.
The group is also collaborating with other groups to look at the feasibility of using a single-cell sequencing method for other basic biological studies and in the clinical arena as a means of testing circulating tumor cells, monitoring treatment response, and so on, Hicks told IS, explaining that the approach is well suited to characterizing rare cells within a population.
For example, in collaboration with Memorial Sloan-Kettering Cancer Center's Howard Scher, Hicks said the team is looking at single-cell copy number patterns related to treatment outcome in prostate cancer patients (CSN 9/7/2011).
The high-throughput, reduced read single-nucleus sequencing method is compatible with any sequencing platform, though Baslan noted that the bin boundaries will differ somewhat depending on the technology used, owing to differences in the error rate models for different platforms.
A recent Nature Protocols paper describing the approach contains hints for re-doing the simulations using other platforms.
The cost per cell is currently around $60, though the group is working to bring that closer to around $10 by tweaking the method. For instance, Baslan noted that they are exploring the possibility of incorporating microfluidics strategies into the method.
Stephen Quake's Stanford University group is already applying microfluidics to single-cell sequencing, having spent several years developing ways to amplify and sequence single cells or individual chromosomes isolated with microfluidic devices.
In a Nature Biotechnology study, for example, the team showed that it could amplify, genotype, and do haplotype analyses on individual chromosomes that had been isolated from a single cell using Fluidigm microfluidic devices — an approach that makes it possible to determine the parental source for each chromosome and do recombination mapping (IS 12/21/2010).
To help with some SNP phasing, the researchers also used the Illumina GAII to do light, paired-end sequencing on amplified individual chromosomes, generating sequence that covered the chromosomes to a depth of between 3.5 times and almost eight times, on average.
Using a similar microfluidics approach, they have also isolated and sequenced individual bacterial cells, amplifying DNA from each by multiple displacement amplification, Quake explained. He estimated that the current cost of sequencing single-cell bacterial genomes is around $1,000 apiece.
The researchers have now started to use microfluidics to separate and sequence individual sperm cells for recombination studies and Quake noted that they are also using single-cell methods to interrogate individual cancer cells.
As technical challenges associated with single-cell sequencing start to get resolved, the decision about whether to use the approach is becoming a question of cost per data point rather than technical feasibility, Quake told IS, explaining that the benefit of having genetic data for individual cells may outweigh the added sequencing costs for some studies.
Moreover, he emphasized that there is a wide range of applications for the single-cell amplicons other than whole-genome sequencing, including genotyping, haplotyping, and analyses on targeted regions of the genome.
Last week, Fluidigm, a company co-founded by Quake, announced that it is collaborating with the Broad Institute to launch a Single-Cell Genomics Center at the Broad's headquarters in Cambridge, Mass.
Quake, who currently chairs the company's scientific advisory board, is not directly involved with the new collaboration but said it will involve the use of microfluidic devices similar to those that he and his colleagues are using for single-cell isolation and amplification.
In their own genome mapping studies, Albert Einstein College of Medicine's Adam Auton and his collaborators from China and the UK sequenced and compared individual sperm cells from the same donor individual, an unidentified, 50-year-old Asian man, to look at recombination events in his cells.
As Auton explained during his Biology of Genomes presentation, the researchers started by sequencing MDA-amplified DNA from just over 200 individual sperm cells at relatively low depth. These included three sperm cells sequenced to 5-fold coverage, 10 cells sequenced to 2-fold coverage, and 193 cells sequenced to 0.2-fold coverage.
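As a quick check on the figures above, the Python sketch below tallies the three coverage tiers. The variable names are illustrative, and the total is expressed in "genome equivalents" (cells multiplied by fold coverage), a convenience unit for this example rather than one used in the presentation.

```python
# (number of cells, fold coverage per cell) for each tier quoted in the article
coverage_tiers = [
    (3, 5.0),
    (10, 2.0),
    (193, 0.2),
]

total_cells = sum(cells for cells, _ in coverage_tiers)
total_genome_equivalents = sum(cells * depth for cells, depth in coverage_tiers)

print(total_cells)
print(round(total_genome_equivalents, 1))
```

The tiers add up to 206 cells, matching "just over 200," and to about 73.6 genome equivalents of sequence in total.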
They also generated 10X coverage of blood DNA from the same individual by Illumina paired-end sequencing and had access to array data on the donor's mother, wife, and son.
When they started looking at recombination events in the low coverage sperm genomes, though, the investigators found some sperm genomes with heterozygous sequences and/or sequences from both sex chromosomes.
These features suggested that at least some of the cells that had been sequenced were contaminants since such features shouldn't be present in sequences generated from haploid sex cells.
For their subsequent recombination analyses, Auton and his colleagues focused on 40 sperm cell sequences that appeared to be authentic based on coverage and sequence patterns.
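The kind of screen described above can be pictured with a small Python sketch. This is a hypothetical filter, not the method Auton's team used: the function name, the input summary statistics, and both thresholds are assumptions chosen for illustration. It flags a sequenced cell as consistent with a single haploid sperm only if it shows very few heterozygous calls and reads from essentially one sex chromosome.

```python
def looks_like_single_sperm(het_fraction, x_reads, y_reads,
                            max_het_fraction=0.02,
                            max_minor_sex_fraction=0.05):
    """Return True if the summary statistics are consistent with one haploid cell.

    het_fraction: share of informative sites called heterozygous (hypothetical input).
    x_reads, y_reads: reads assigned to chromosome X and chromosome Y.
    Both thresholds are illustrative assumptions, not published cutoffs.
    """
    sex_total = x_reads + y_reads
    if sex_total == 0:
        return False  # no sex-chromosome reads, so the cell cannot be assessed
    # Fraction of sex-chromosome reads coming from the less represented chromosome.
    minor_sex_fraction = min(x_reads, y_reads) / sex_total
    return (het_fraction <= max_het_fraction
            and minor_sex_fraction <= max_minor_sex_fraction)


if __name__ == "__main__":
    print(looks_like_single_sperm(0.0, 1000, 5))    # one sex chromosome, no het calls
    print(looks_like_single_sperm(0.2, 1000, 5))    # many heterozygous calls
    print(looks_like_single_sperm(0.0, 500, 500))   # both sex chromosomes present
```

In practice, any real cutoff would need to allow for sequencing and mapping errors, and heterozygosity is hard to observe at very low coverage, so a filter like this would only be a first pass.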
From this subset of single cells, the group got new clues about the frequency of crossover events in the donor individual's cells as well as recombination hotspot usage, which appeared to vary somewhat from one cell to the next.