NEW YORK (GenomeWeb) – Researchers in New York have developed a means of quickly identifying human DNA samples using a portable sequencer.
The approach developed by the New York Genome Center's Yaniv Erlich and his colleagues relies on Oxford Nanopore's MinION sequencer and a Bayesian algorithm to compare a random mix of variants uncovered through sequencing to a database of known genetic profiles. As they reported this week in the journal eLife, the MinION sketching approach could reidentify humans from DNA in about three minutes of sequencing using between 60 and 300 random SNPs.
This, the researchers said, could open up near real-time DNA authentication with applications ranging from identifying victims of mass disasters to recognizing mislabeled or contaminated cell lines.
"Our method opens up new ways to use off-the-shelf technology to benefit society," Erlich, who is also a professor at Columbia University, said in a statement.
In this approach, dubbed MinION sketching, DNA is prepared for either 1D or 2D shotgun sequencing on the MinION, a USB-compatible handheld DNA sequencer. Variants in aligned reads are then compared to a reference database, and a Bayesian algorithm calculates the probability that the sample being analyzed matches a given entry in that database. The algorithm updates the probability of a match as more markers are analyzed.
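To illustrate how a match probability can be updated marker by marker, here is a minimal Python sketch. It is not the researchers' implementation: it simplifies the problem to two hypotheses (the sample is a given database entry, or it is an unrelated person), and the function names, the 10 percent per-read error rate, and the use of population allele frequencies for the "unrelated" case are all assumptions made for illustration.

```python
from math import exp, log

def _sigmoid(x):
    """Numerically stable conversion from log-odds to probability."""
    if x >= 0:
        return 1.0 / (1.0 + exp(-x))
    z = exp(x)
    return z / (1.0 + z)

def _p_observation(observed_alt, p_alt_true, error_rate):
    """Probability of the observed allele given the true chance of
    sampling the alternate allele and a symmetric read error rate."""
    p_alt_obs = p_alt_true * (1 - error_rate) + (1 - p_alt_true) * error_rate
    return p_alt_obs if observed_alt else 1 - p_alt_obs

def update_match_probability(prior, observations, error_rate=0.1):
    """Return the posterior match probability after each SNP.

    observations: iterable of (observed_alt, alt_copies, alt_freq), where
      observed_alt - True if the read shows the alternate allele
      alt_copies   - alternate-allele count (0, 1, 2) in the database entry
      alt_freq     - population frequency of the alternate allele (assumed known)
    error_rate must be > 0 and alt_freq strictly between 0 and 1.
    """
    log_odds = log(prior / (1 - prior))
    posteriors = []
    for observed_alt, alt_copies, alt_freq in observations:
        # Likelihood if the sample really is this database entry.
        like_match = _p_observation(observed_alt, alt_copies / 2.0, error_rate)
        # Likelihood if the sample is an unrelated person.
        like_other = _p_observation(observed_alt, alt_freq, error_rate)
        log_odds += log(like_match) - log(like_other)
        posteriors.append(_sigmoid(log_odds))
    return posteriors

if __name__ == "__main__":
    # Hypothetical input: 20 reads, each agreeing with a homozygous-alt entry.
    reads = [(True, 2, 0.2)] * 20
    for n, p in enumerate(update_match_probability(1 / 31000, reads), start=1):
        print(n, round(p, 6))
```

Under these assumptions, each read that agrees with the database entry raises the match probability and each disagreeing read lowers it, and the error-rate term keeps a single miscalled base from ruling a candidate out.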
Erlich and his colleagues said that their approach deliberately avoids PCR in order to shorten sample preparation, reduce the number of steps, and avoid the species bias that human-specific primers introduce. Meanwhile, they added, the Bayesian algorithm compensates for the noise of low-coverage, error-prone sequencing.
To test the MinION sketching approach, Erlich and his colleagues built two large reference databases, one containing 31,000 genome-wide genotyping array files from people who used direct-to-consumer genetic testing companies like 23andMe, AncestryDNA, and FamilyTree DNA, and one containing genotyping array files, each covering about 800,000 SNPs, from 1,099 cancer cell lines from the Cancer Cell Line Encyclopedia.
Using the first database as a reference, the researchers tested their ability to reidentify human DNA samples. With the R7 MinION chemistry, they could reidentify a sample from an Ashkenazi-Uzbeki male individual within 13 minutes using 110 SNPs. That time fell to less than five minutes when they used the R9 chemistry to test samples from a Northern European female individual and a Northern European-Italian-Ashkenazi male individual, relying on between 98 SNPs and 134 SNPs.
Erlich and his colleagues estimated that between 91 SNPs and 195 SNPs are needed for a MinION sketch to reach a match probability of 99.9 percent. They also calculated that a sample can be considered absent from the reference database if no 99.9 percent match is found after 300 SNPs.
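The two thresholds reported here, a 99.9 percent match probability and a 300-SNP cutoff, amount to a simple stopping rule. The Python sketch below shows one way such a rule could be expressed. The function name, the label strings, and the idea of feeding it a running list of match probabilities are hypothetical and are not drawn from the researchers' software.

```python
from typing import Iterable, Tuple

def classify_sketch(
    posteriors: Iterable[float],
    match_threshold: float = 0.999,  # 99.9 percent, as reported in the article
    max_snps: int = 300,             # SNP cutoff, as reported in the article
) -> Tuple[str, int]:
    """Apply the stopping rule to a stream of match probabilities.

    posteriors holds the match probability after each successive SNP.
    Returns a (label, snps_used) pair, where label is one of:
      "match"           - threshold reached
      "not_in_database" - cutoff reached without a match
      "undetermined"    - data ran out before either condition was met
    """
    n = 0
    for n, p in enumerate(posteriors, start=1):
        if p >= match_threshold:
            return ("match", n)
        if n >= max_snps:
            return ("not_in_database", n)
    return ("undetermined", n)

if __name__ == "__main__":
    print(classify_sketch([0.10, 0.50, 0.9995]))
    print(classify_sketch([0.01] * 300))
    print(classify_sketch([0.01] * 50))
```

Checking the match threshold before the SNP cutoff means that a sample crossing 99.9 percent on exactly the 300th SNP still counts as a match, which is a design choice of this sketch rather than something the article specifies.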
Then, with the second database and R9 chemistry, the researchers tested how well they could identify contaminated cell lines. They first authenticated the monocytic leukemia cell line THP1, which took about three minutes and 91 SNPs, and then mixed that cell line in equal amounts with another human sample. For this contaminated sample, the algorithm found no match. They further reported that for samples with more than 25 percent contamination, the algorithm would not return a match.
Erlich and his colleagues said that researchers who do authenticate their cell lines generally use STR-based identification and send samples out to the American Type Culture Collection. With shipping and analysis, results can take two weeks, and the researchers said that in that time a previously low level of contamination could grow to predominate.
The researchers argued that their approach is a faster and, with a $1,000 startup cost for the MinION, more cost-effective way to monitor cell lines.