Researchers from the University of Washington have combined arbitrary single-primer PCR and next-generation sequencing to decode the lineage of individual cells in limited starting material — on the order of several hundred cells.
The researchers, who evaluated the method on a model lineage of cultured mouse cells, believe it could serve as a stopgap approach for inferring cellular ancestry until it becomes "economically feasible to sequence progressively larger portions of the genome from greater numbers of single cells," they wrote in a description of the method published online last month in Nature Methods.
The group has been developing the method as a way to infer a cell’s mitotic history from its accumulated somatic mutations. They hope such lineages will allow researchers to study cell division during normal development and differentiation as well as the aberrant division in diseases like cancer.
Marshall Horwitz, a University of Washington pathologist and leader of the lab behind the sequencing method, told In Sequence that looking to somatic mutations to provide an accurate picture of a cell’s lineage is not new. However, he said, his group believes it is the first to adopt sequencing for this purpose, combining arbitrarily primed PCR with sequencing on Life Technologies' SOLiD platform.
“It’s a pretty simple idea — that DNA mutations are inevitable, and every time a cell divides, mutations accumulate in the genome of individual cells, thereby recording a history,” Horwitz said.
“What separates this paper is that this is the first time, to our knowledge, anybody has used sequencing to do [this], whereas in the past we [and other groups] have used particular defined base pair sequences to look for microsatellite repeats” using PCR and capillary electrophoresis, he said.
Ideally, Horwitz said, researchers would sequence the genomes of individual cells, and catalogue all their mutations. "But it's not practical, yet, although the technology is getting better and better," he said.
And while Horwitz and colleagues note in their paper that some groups have successfully used whole-genome amplification and single-cell sequencing to recapitulate cell lineages, they claim that WGA "leads to inconsistent and unfaithful amplification … across interrogated sites such that the inferred lineages are based on few data and are therefore unreliable." Furthermore, they said that such WGA approaches have only been used for tumor cells, so it "remains uncertain if this will work for normal tissues lacking large-scale, cancer-specific genomic alterations."
An Arbitrary Approach
The University of Washington researchers hypothesized that sequencing just part of the genome might allow them to identify enough somatic mutations to trace out a family tree without needing to sequence the whole genomes of individual cells. They hit on arbitrary single-primer PCR as a way to sequence portions of the genome in a more random fashion than targeted sequencing methods.
"The question then was, 'What part of the genome should we look at?' Most [technologies] for capturing are designed to look at specific sequences in genes," Horwitz said. "But we figured most of the mutations that occur somatically are going to exist in intergenic places, so we didn't want to really bias it with any kind of preconceived notion … and we thought arbitrary PCR would be great for this."
With arbitrary PCR, he said, "you can kind of dial the amount of the genome you want to amplify with a single primer just by altering the length of the primer — the longer the primer, the less frequently it's going to hybridize."
The group did several simulations, Horwitz said, testing a number of primers and hitting on one 10-mer oligonucleotide as the best prospect.
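Horwitz's point about primer length can be illustrated with a back-of-the-envelope calculation. The sketch below is not the group's simulation. It assumes a genome of uniformly random bases, perfect-match priming on both strands, and an approximate mouse genome size of 2.7 billion bases, all simplifications introduced here for illustration. The function name `expected_priming_sites` is hypothetical.

```python
# Rough estimate of how often an arbitrary primer is expected to match a genome.
# Assumptions (not from the paper): bases are uniformly random and independent,
# only perfect matches count, and both strands are considered.

MOUSE_GENOME_BP = 2_700_000_000  # approximate size, assumed for illustration


def expected_priming_sites(primer_length: int, genome_bp: int = MOUSE_GENOME_BP) -> float:
    """Expected number of perfect-match sites for one primer across both strands."""
    if primer_length < 1:
        raise ValueError("primer_length must be at least 1")
    # Each position matches a given k-mer with probability 1 / 4**k.
    per_strand = genome_bp / (4 ** primer_length)
    return 2 * per_strand


if __name__ == "__main__":
    for k in range(8, 13):
        print(f"{k}-mer: about {expected_priming_sites(k):,.0f} expected sites")
```

Under these assumptions, each added base cuts the expected number of sites fourfold, which is the "dial" Horwitz describes. Real genomes are not random and arbitrary PCR tolerates some mismatches, so actual counts will differ.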
In their Nature Methods report, they describe constructing a cultured mouse cell lineage to evaluate the accuracy of the approach. "We felt it was important to construct a lineage in vitro and then to try out the whole approach to see if we could validate it," Horwitz said.
Over the course of several months, the researchers cultured a single cell into successive generations, extracting DNA from each step of the model tree.
"The net result was, we had a defined lineage we could then test … we set up an experiment where we know the right answer and could see if we could get it [using arbitrary PCR and sequencing]," Horwitz said.
The researchers tested the method against the constructed tree using dilutions corresponding to approximately 100 cells. After performing PCR with the arbitrary primer, the researchers sequenced PCR products, generating 23 gigabases of sequence, of which 72 percent mapped to the reference genome. Arbitrary PCR showed sample-to-sample consistency across whole chromosomes and reproducibly amplified sequences common to all samples, the authors reported.
"It's important that the PCR is really robust … from sample to sample," Horwitz said. "You want to make sure you're amplifying the same fragments from one sample to the next because if you aren't, you won't get much sequencing information in common."
The group identified 592 mutations, of which 315 demonstrated segregation consistent with the known cultured lineage. When the researchers attempted to reconstruct the tree using either Bayesian or neighbor-joining approaches, the inferred lineages were 79 percent and 75 percent identical to the real tree, respectively, the authors wrote.
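One way to read "segregation consistent with the known lineage" is that the samples carrying a mutation form exactly one branch of the tree: the node where the mutation arose plus all of its descendants. The sketch below checks that condition on a made-up lineage. The tree, sample names, and function names are hypothetical, and this is not necessarily the criterion the authors applied.

```python
from typing import Dict, FrozenSet, Iterable, List, Set

# Hypothetical toy lineage, written as parent -> children.
TREE: Dict[str, List[str]] = {
    "root": ["A", "B"],
    "A": ["A1", "A2"],
    "B": ["B1", "B2"],
}


def subtree(node: str, tree: Dict[str, List[str]]) -> Set[str]:
    """Return the node together with all of its descendants."""
    members = {node}
    for child in tree.get(node, []):
        members |= subtree(child, tree)
    return members


def clades(tree: Dict[str, List[str]], root: str) -> Set[FrozenSet[str]]:
    """Every set of samples that forms one complete branch of the tree."""
    return {frozenset(subtree(node, tree)) for node in subtree(root, tree)}


def is_consistent(carriers: Iterable[str], tree: Dict[str, List[str]], root: str = "root") -> bool:
    """True if the carriers are exactly one node plus all of its descendants."""
    return frozenset(carriers) in clades(tree, root)


if __name__ == "__main__":
    print(is_consistent({"A", "A1", "A2"}, TREE))  # mutation arose at node A
    print(is_consistent({"A1", "B1"}, TREE))       # carriers span two branches
```

A mutation seen only in samples from separate branches, as in the second call, would count against the known tree under this definition. Such calls could also reflect PCR or sequencing error rather than biology.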
If the method is applied to cells extracted from an individual organism, only the terminal nodes of the tree would be available, since the intermediate nodes would not be known as they were for the cultured cells. As a result, the researchers also tested their ability to reconstruct the cells' phylogeny using only the terminal nodes. In this experiment, the group was able to perfectly match the tree's known lineage, the researchers reported. "We achieved greater accuracy … because there was more common sequence and greater mitotic distance, with a greater likelihood of mutation, separating sampled nodes," they wrote.
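Distance-based methods such as neighbor joining start from pairwise distances between samples. As a minimal sketch, assuming each terminal sample is summarized as the set of mutations it carries, the distance between two samples can be taken as the number of mutations found in one but not the other. The sample names and mutation labels below are invented, and the paper's actual distance measure may differ.

```python
from itertools import combinations
from typing import Dict, Set, Tuple

# Hypothetical mutation calls for three terminal samples.
PROFILES: Dict[str, Set[str]] = {
    "leaf1": {"m1", "m2", "m3"},
    "leaf2": {"m1", "m2", "m4"},
    "leaf3": {"m5", "m6"},
}


def mutation_distance(a: Set[str], b: Set[str]) -> int:
    """Count mutations present in exactly one of the two samples."""
    return len(a ^ b)


def distance_matrix(profiles: Dict[str, Set[str]]) -> Dict[Tuple[str, str], int]:
    """Pairwise distances keyed by alphabetically ordered sample pairs."""
    return {
        (x, y): mutation_distance(profiles[x], profiles[y])
        for x, y in combinations(sorted(profiles), 2)
    }


if __name__ == "__main__":
    for pair, dist in distance_matrix(PROFILES).items():
        print(pair, dist)
```

Samples separated by more cell divisions tend to share fewer mutations and so sit farther apart in such a matrix, in line with the authors' remark about greater mitotic distance between sampled nodes.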
Horwitz said that the group's arbitrary PCR and sequencing method is simpler, more elegant, and more robust than their previous attempts to infer cell lineage through molecular analysis.
"In some of the prior work we did with poly-guanine tracts, we did something like over 100,000 PCR reactions using robots, and 384-well plates, and for this you only need to do one library preparation, and we divided the flow cell into eight segments, so you could also multiplex, and the throughput is really greatly advanced," he said.
While he acknowledged that "there is the potential to introduce errors in the form of mistakes made during PCR by the polymerase [as well as] sequencing errors," Horwitz stressed that "it's still much more robust than looking at length-altering mutations and microsatellite repeats."
"That requires a little bit [of] subjectivity in interpretation … whereas here, you have really objective data," he said.
The team believes their approach could be used to derive a mouse or human cell lineage tree in the manner of John Sulston's early fate-map for Caenorhabditis elegans.
Horwitz said the group thinks the method is also robust enough to start asking "lots of interesting questions" about other organisms.
"In the paper, we describe particular conditions for the mouse, so what it means is, it's going to be a little more challenging [to adapt the method for another organism]," he said. "You might have to look at more sequence data in order to compile an appropriate number of mutations to get adequate information for constructing lineage."
"But for starters we wanted to make it as simple as possible," he said.
In unpublished work, the group has already adapted the method toward the human genome, Horwitz said, which "didn't take much effort because it's roughly comparable in size."
The lab previously applied its earlier approach, based on poly-guanine tracts, to ulcerative colitis, showing that it was possible to predict the development of cancer by tracking clonal waves sweeping through patches of a patient's colon.
Now the group is repeating all its ulcerative colitis experiments with the deep-sequencing approach, Horwitz said.
Beyond the lineages themselves, Horwitz said there is also potential to glean information from the actual mutations his team and other lineage methods find.
"I would argue that there are a lot of interesting results you're going to get just from looking at somatic mutation profiles, and reconstructing cell lineage, but not to be overlooked is the wealth of information from the mutations themselves in terms of mutagenic mechanisms and how that relates to cancer and other diseases," he said.
Researchers have used other techniques for mapping cell lineages in recent studies of cancer cells. According to Horwitz, while his group's arbitrary sequencing method could also be used to study cancer, some of the other approaches used in cancer studies might not be similarly adaptable to studying non-malignant cells, which he said makes his group's method more broadly applicable.
In a recent Nature study, for example, researchers at Cold Spring Harbor Laboratory used deep sequencing with whole-genome amplification to trace copy number variation and track tumor evolution.
Horwitz said that while this might work well for cancer, and was a great study, "it's not clear that those sorts of mutational events happen enough in any circumstance except for malignancy."
Such mutations, he added, "just wouldn't be compatible with most cells or organism development."
Horwitz described the method as a temporary option, however. "Ten years from now, I think this will all be quite trivial," he said, as technology for single-molecule and single-cell sequencing advances.
He noted that, in principle, single-molecule approaches like the Pacific Biosciences RS should eventually allow researchers to get down to the resolution of a single cell without having to do any amplification at all.