Directional Single Cell Sequencing Method Discovers Mistakes in Reference Genomes


Researchers at the British Columbia Cancer Agency in Vancouver have developed a directional single-cell sequencing method that allows them to selectively sequence the original DNA template strand a daughter cell has inherited from its parent.

The method, called Strand-seq and published online in Nature Methods this week, works by labeling newly synthesized DNA during replication and then selectively damaging these new strands in the daughter cells, so that only the original parental strand gets sequenced.

As described in the paper, Strand-seq enabled the researchers to map sister chromatid exchanges in mouse embryonic stem cells at higher resolution than previously possible and to identify aneuploidy events and copy number variations in these cells.

They also discovered misoriented contigs and fragments in about 1 percent of the mouse reference genome assembly that had not been detected before.

The scientists expect the method to uncover similar errors in other reference genomes, including the human genome, and to be useful for haplotyping and for detecting genomic rearrangements, such as inversions and translocations, that are difficult to study by other means.

Peter Lansdorp, the senior author of the paper, said that he originally decided to develop Strand-seq because of his interest in studying possible epigenetic differences between sister chromatids, the two identical copies of each chromosome that get distributed to the two daughter cells when a cell divides.

Each sister chromatid contains one original template strand from its parent – either the Watson or the Crick strand – and one newly synthesized DNA strand. However, sister chromatids sometimes swap parts with each other in order to repair double-strand breaks. Such sister chromatid exchanges (SCEs) are usually rare; when they occur frequently, they indicate genotoxic stress or a genetic disorder associated with cancer, such as Bloom's syndrome. Until now, it was impossible to map SCEs at high resolution.

Strand-seq builds on earlier work, published two years ago, in which Lansdorp and colleagues showed that the parental Watson and Crick strands can be identified in sister chromatids using chromosome orientation fluorescence in situ hybridization.

"Then we became more ambitious and thought, 'Can we maybe use that for sequencing?'" said Lansdorp, who is scientific director of the European Research Institute for the Biology of Ageing at the University Medical Center Groningen in the Netherlands and is still affiliated with the BC Cancer Agency.

For Strand-seq, the scientists cultured murine embryonic stem cells in the presence of bromodeoxyuridine for one round of DNA replication, so that only the newly synthesized strands would contain the nucleotide analog. Following cell division, they isolated single daughter cells, fragmented the DNA using micrococcal nuclease, and ligated adaptors to make Illumina sequencing libraries. Prior to the PCR amplification step of the library protocol, they used UV photolysis to introduce nicks into the BrdU-substituted strand, so that only the original DNA template strand could be amplified and become part of the sequencing library.

In total, they constructed 66 indexed single-cell libraries – 62 Strand-seq libraries and four standard whole-genome shotgun libraries – which they pooled and sequenced on the Illumina GAIIx or HiSeq 2000 using 76-base paired-end reads.

For the Strand-seq libraries, genome coverage ranged between 0.65 percent and 6.5 percent, and all Strand-seq libraries combined covered about 65 percent of the mouse genome.

Reads mapped back either to the forward or the reverse direction of the reference genome, indicating which parental template strand – either Watson or Crick – the daughter cell had inherited. If a sister chromatid exchange had happened, the read distribution between Watson and Crick changed.
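The strand-switch logic described above can be sketched in a few lines of Python. This is only an illustration, not the authors' analysis pipeline: the function names, the 1-megabase bin size, the minimum read count, and the 80 percent threshold are all assumptions chosen for the example. Reads are binned along a chromosome, each bin is labeled by the direction most of its reads map in, and a change of label between neighboring bins is flagged as a candidate exchange. A "mixed" label is included because the two homologs in a diploid cell can carry different template strands.

```python
from collections import Counter

def strand_state_per_bin(reads, bin_size=1_000_000, min_reads=5, threshold=0.8):
    """Label each bin 'fwd', 'rev', or 'mixed' from (position, strand) reads.

    reads: iterable of (position, strand), strand being '+' or '-'.
    All parameter defaults are illustrative assumptions.
    """
    counts = {}
    for pos, strand in reads:
        counts.setdefault(pos // bin_size, Counter())[strand] += 1

    states = {}
    for b in sorted(counts):
        total = counts[b]["+"] + counts[b]["-"]
        if total < min_reads:
            continue  # too few reads to call this bin
        frac_fwd = counts[b]["+"] / total
        if frac_fwd >= threshold:
            states[b] = "fwd"
        elif frac_fwd <= 1 - threshold:
            states[b] = "rev"
        else:
            states[b] = "mixed"
    return states

def find_switches(states):
    """Return (bin_before, bin_after, state_before, state_after) for each change."""
    bins = sorted(states)
    return [
        (a, b, states[a], states[b])
        for a, b in zip(bins, bins[1:])
        if states[a] != states[b]
    ]

# Toy chromosome: forward-strand reads in bins 0-2, reverse-strand reads in bins 3-5.
toy_reads = [
    (b * 1_000_000 + i * 1000, "+" if b < 3 else "-")
    for b in range(6)
    for i in range(10)
]
toy_states = strand_state_per_bin(toy_reads)
toy_switches = find_switches(toy_states)
```

In this toy input the only change of label falls between bins 2 and 3, which is where a candidate exchange would be reported. With real single-cell data, coverage of a few percent means that bin size and read thresholds would need tuning.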

"You can now map SCEs with sensitivity that's orders of magnitude higher than … using cytogenetic techniques," Lansdorp said. He and his colleagues are now applying the technique in projects with various cell types.

In chromosomes 10 and 14 of the mouse genome, they also observed complete template strand switches in all their libraries, which they could not explain by SCEs or translocations, and they went on to show that these probably resulted from misoriented contigs in the mouse reference genome. In total, they found 17 contig fragments, ranging in size from 167 kilobases to 13.1 megabases and covering almost 1 percent of the mouse genome, that are very likely incorrectly oriented.
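The reasoning here, that a switch seen at the same place in every library points to the reference rather than to the cell, can also be sketched in code. The following Python example is a hypothetical illustration and not the method used in the paper: it assumes each library has already been reduced to a mapping from bin index to a "fwd" or "rev" label, treats the most common label in a library as that chromosome's inherited state, and reports bins that are flipped relative to that state in all libraries. The function names and the `min_fraction` parameter are invented for the example.

```python
from collections import Counter

def flipped_bins(states):
    """Bins whose label differs from the library's majority 'fwd'/'rev' label."""
    calls = [s for s in states.values() if s in ("fwd", "rev")]
    if not calls:
        return set()
    majority = Counter(calls).most_common(1)[0][0]
    return {b for b, s in states.items() if s in ("fwd", "rev") and s != majority}

def candidate_misorientations(libraries, min_fraction=1.0):
    """Bins flipped in at least min_fraction of libraries (default: all of them)."""
    if not libraries:
        return []
    votes = Counter()
    for states in libraries:
        votes.update(flipped_bins(states))
    n = len(libraries)
    return sorted(b for b, v in votes.items() if v / n >= min_fraction)

def make_library(base, flipped):
    """Build a toy 10-bin library with the given bins flipped from `base`."""
    other = "rev" if base == "fwd" else "fwd"
    return {b: (other if b in flipped else base) for b in range(10)}

# Bins 4 and 5 are flipped in every toy library; bins 8 and 9 are flipped
# in only one, as a single exchange event in that cell would be.
toy_libraries = [
    make_library("fwd", {4, 5}),
    make_library("rev", {4, 5}),
    make_library("fwd", {4, 5, 8, 9}),
]
toy_candidates = candidate_misorientations(toy_libraries)
```

Only bins 4 and 5 are returned for the toy data, because the flip in bins 8 and 9 appears in a single library. Requiring agreement across all libraries is a deliberately strict choice for the sketch; a lower `min_fraction` would tolerate bins that some libraries failed to cover.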

Lansdorp said that one of the biggest errors has already been corrected in mm10, an updated version of the mouse reference genome. His team is in touch with the Genome Reference Consortium, which maintains the human, mouse, and zebrafish genome assemblies, about incorporating their findings.

He said that his team has also identified errors in the human reference genome, which they plan to describe in a separate paper.

Besides correcting existing reference genomes, Strand-seq could also be useful for improving new genome assemblies. "For new genomes, this would tremendously accelerate the correct assembly of maps," Lansdorp said.

Strand-seq could also be helpful for haplotyping and studying copy number variations. Another possible application is the detection of balanced rearrangements, for example translocations, that are difficult to find otherwise.

In tumor studies, Strand-seq could also provide information about the mechanism of genome instability and fusion gene formation. "You would get a much deeper insight than what you would typically get from analyzing either bulk material or doing cytogenetics studies," he said.

The bottleneck for applying Strand-seq more widely right now is making the single-cell libraries, which is laborious and technically challenging. Lansdorp's group is working on automating this process and using smaller volumes.

The researchers would also like to increase the genome coverage per cell from a few percent to perhaps 20 percent, and are currently trying to identify the limiting steps. "We think it's the ligation but we haven't really explored that to increase the amount of DNA that is captured from a single cell," Lansdorp said.
