NEW YORK – Researchers from Albert Einstein College of Medicine have developed a new method, called single-molecule mutation sequencing (SMM-seq), for detecting ultrarare mutations in cells and tissues that they say is on par with single cell-based approaches.
The technique, published in Science Advances last week, is a variation of duplex sequencing that further reduces errors by generating multiple independent copies of each strand of parental DNA via rolling circle amplification prior to amplification by PCR and sequencing.
Duplex sequencing in general involves separating the two strands of a DNA fragment, sequencing them independently, and comparing the sequences of the top and bottom strands to spot errors. But individual strands sometimes get lost, thereby limiting the number of DNA fragments that can be accurately evaluated for sequence variations.
In contrast, SMM-seq involves ligating hairpin-like adapters to each end of a double-stranded DNA fragment, forming a circle. Each adapter contains a unique molecular identifier and a rolling circle primer binding site. Polymerase can then bind at either site and copy both strands in a circular fashion, resulting in a single, linearly amplified strand with a common linker sequence between each strand copy. The copies can be separated at their linkers for individual PCR amplification when generating a sequence library.
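As a rough illustration of that last step, separating a rolling circle product at its linkers can be sketched in a few lines of Python. This is a toy example, not the authors' pipeline: the linker sequence, the fragment sequences, and the function name split_concatemer are all hypothetical, and real reads would need error-tolerant linker matching rather than exact string splitting.

```python
# Hypothetical linker; the actual SMM-seq adapter sequence is not given in this article.
LINKER = "ACGTTGCA"


def split_concatemer(read: str, linker: str = LINKER) -> list[str]:
    """Split a rolling-circle product into strand copies at each exact linker match."""
    return [segment for segment in read.split(linker) if segment]


# Toy fragment: a top strand and its reverse complement, copied twice around the circle.
copy_a = "GATTACA"
copy_b = "TGTAATC"
concatemer = LINKER.join([copy_a, copy_b, copy_a, copy_b])

copies = split_concatemer(concatemer)
print(copies)
```

Each recovered segment stands in for one independent copy of a parental strand, which is what the downstream filtering relies on.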
"Since the copies originate from the same template, an error created during this rolling circle amplification procedure will be unique for each copy," explained Alexander Maslov, the study's first author, and these artifacts can then be filtered out in downstream analyses.
After establishing that SMM-seq could accurately track mutation rates in cells exposed to low doses of a mutagen, Maslov and his colleagues used it to identify age-related somatic variation in cells taken from tissue samples.
"I've been interested all my life in the process of aging and how it could be caused by the accumulation of somatic mutations," said Jan Vijg, co-corresponding author with Maslov and head of the lab at Albert Einstein where SMM-seq was developed.
Identifying and understanding these mutations at the whole-genome level had been "almost impossible" prior to the advent of single-cell and NGS technologies, Vijg explained, because so many mutations occur randomly in individual cells.
"[They] will accumulate differently in each cell and every mutation will differ from cell to cell," he said.
Reanalyzing whole-genome data from the team's recently published study on age-related mutational load in the human liver, the researchers found that SMM-seq identified a mutational load comparable to that seen via the gold-standard single-cell approach.
SMM-seq's accuracy is based on the probability of an error occurring in each independent strand copy. Essentially, a true mutation would be expected to appear on every copy made from the parental strands.
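The filtering logic can be sketched as a simple consensus rule: report a mismatch against the reference only if every copy carries it. The Python below is a minimal sketch under stated assumptions, not the published analysis. It assumes the copies are already aligned, equal in length, and in the same orientation as the reference; the function name consensus_variants, the min_copies parameter, and the sequences are invented for illustration.

```python
def consensus_variants(
    reference: str, copies: list[str], min_copies: int = 2
) -> dict[int, str]:
    """Return {position: base} for mismatches that appear in every copy.

    Assumes all copies are pre-aligned to the reference and have its length.
    """
    if len(copies) < min_copies:
        return {}  # too few independent copies to separate errors from mutations
    variants = {}
    for pos, ref_base in enumerate(reference):
        bases = {copy[pos] for copy in copies}
        if len(bases) == 1:  # all copies agree at this position
            (base,) = bases
            if base != ref_base:
                variants[pos] = base
    return variants


reference = "GATTACAGATTACA"
copies = [
    "GATTACAGCTTACA",  # A>C at position 8, shared by all copies
    "GATTACAGCTTACA",
    "GAGTACAGCTTACA",  # extra T>G at position 2, seen in this copy only
    "GATTACAGCTTACA",
]

print(consensus_variants(reference, copies))
```

In this toy input, the change shared by all four copies is kept as a candidate mutation, while the change seen in a single copy is discarded as an artifact.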
In estimating the optimal analysis parameters for SMM-seq, Maslov and his colleagues compared the mutation frequencies associated with using different numbers of rolling circle strand copies to filter out PCR- and sequencing-induced mutations. SMM-seq's accuracy stopped improving significantly after seven copies, while using just two copies "recapitulates the duplex sequencing approach," Maslov said.
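A back-of-the-envelope model shows why requiring more copies helps. If each copy independently picks up the same erroneous base at a given site with probability p, the chance that all n copies carry it is p raised to the power n. The Python sketch below encodes only that simplification. The per-copy error rate is an arbitrary placeholder rather than a figure from the study, and the model does not account for the plateau the researchers reported after seven copies.

```python
def false_call_probability(per_copy_error: float, n_copies: int) -> float:
    """Chance that the same error appears in all n independent copies (toy model)."""
    if n_copies < 1:
        raise ValueError("n_copies must be at least 1")
    return per_copy_error ** n_copies


# Placeholder per-copy error rate, chosen only for illustration.
ASSUMED_ERROR_RATE = 1e-3

for n in (2, 4, 7):
    print(n, false_call_probability(ASSUMED_ERROR_RATE, n))
```

Under this model the false-call probability falls with each added copy, although in practice other factors, such as how many fragments yield enough copies to be analyzed, would limit the benefit.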
Although using more than two copies should improve SMM-seq's accuracy over that of duplex sequencing, a direct head-to-head comparison remains to be done.
Such direct comparisons are planned, but Maslov argued that for now, the group's calculation of how mutation frequency varies by the number of parental strand copies provides a theoretical comparison.
"This is a neat form of duplex sequencing," Jesse Salk, CEO and cofounder of TwinStrand Biosciences, said via email. He added that TwinStrand, which has commercialized duplex sequencing, has worked on this kind of technique itself, aspects of which are included in several company-held patents.
"There are some practical complexities to using this particular form of linking strands," he said. "The one-to-one connection of the strands conceptually helps reduce amplification 'dispersion,' where one amplified better than the other. But the linking of strands also adds certain inefficiencies related to read lengths and hybrid capture and the types of polymerases required for [rolling circle amplification], among other things."
SMM-seq joins other variations of duplex sequencing, such as CODEC, a method currently going through peer review that also physically links the two strands of a DNA duplex to create a longer "single duplex." In contrast to SMM-seq's rolling circle amplification, CODEC focuses on improving the efficiency of duplex sequencing, which would otherwise be constrained by having to sequence each strand separately. It does so by ligating both strands and sequencing them as a single unit, thereby avoiding the need to computationally reassociate reads from within a sequencing library.
Vijg, Maslov, and their colleagues have filed for a patent on SMM-seq and intend to commercialize it, although they said it is too early to discuss specific plans.
In order for that to happen, "we will need to validate extensively against duplex sequencing and other assays, and we need to test many known mutagens," Vijg said.
In the meantime, the Albert Einstein group is collaborating with a number of academic institutions. One collaboration with researchers at the University of Rochester, for example, involves using SMM-seq to compare DNA repair mechanisms in rodent species with very different lifespans, such as mice and naked mole rats, and how each is affected by various spontaneous mutations.
The team also plans to experiment with using SMM-seq to screen people for susceptibility to mutagenic factors that could contribute to disease, for example, to explore why some smokers get lung cancer while others don't.
"We [want to find] people more or less susceptible to mutagens and to the accumulation of mutations, either with age or upon exposure to mutagenic factors," Maslov said. "Maybe we can eventually use that as a prognostic factor for the development of certain diseases."
"It is great to see capable academic groups working on different variations on duplex sequencing," Salk commented, "which says a great deal about the utility of the method."