NEW YORK (GenomeWeb News) – In a feat that may have implications for everything from forensics to securing the privacy of genome-wide association studies, a team of scientists from the Translational Genomics Research Institute (TGen) in Phoenix and the University of California at Los Angeles has developed a method for assessing complex DNA mixtures.
Their paper, appearing online today in PLoS Genetics, describes how they designed and tested an algorithm for pinpointing one individual’s DNA in a complex genetic mixture using high-density SNP data. Based on their results, they concluded that it’s possible to find an individual of interest in a mixture of hundreds or even thousands of people’s DNA, a capability that, the researchers say, could eventually open the door to a whole new way of gathering genetic evidence in law enforcement.
“Within the current forensics setting this is a new way of thinking,” senior author David Craig, associate director of TGen’s Neurogenomics Division, told GenomeWeb Daily News. “It is a fundamental shift in how people approach a crime scene.”
In the past, Craig explained, crime scene investigators couldn’t easily analyze genetic samples containing DNA from multiple people. Although there are some techniques available for teasing apart one person’s DNA from another, once DNA from more than two or three people is mixed together, the data become muddled.
“In large part, forensically identifying whether a person is contributing less than 10 percent of the total genomic DNA to a mixture is not easily done, is difficult to automate, and is highly confounded with the inclusion of more individuals,” the team noted in the paper.
So they came up with a new approach for looking at complicated genetic mixtures: integrating allele intensity measurement data for hundreds of thousands of SNPs to characterize shifts in allele probe intensities, comparing the individual of interest to both a reference population and to the mixture.
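For readers who want a feel for how such a comparison can work, here is a minimal Python sketch. It is an illustration under stated assumptions, not the team's published algorithm: it uses simulated allele fractions in place of real probe intensities, and the statistic (a per-SNP difference of distances followed by a one-sample t statistic) as well as the names `distance_statistic` and `simulate` are hypothetical choices made for this example.

```python
import math
import random
import statistics


def distance_statistic(person, reference, mixture):
    """Illustrative statistic (an assumption, not the paper's exact formula).

    For each SNP j, D_j = |y_j - ref_j| - |y_j - mix_j| is positive when the
    person's allele fraction sits closer to the mixture than to the reference.
    The return value is a one-sample t statistic of D across all SNPs.
    """
    d = [abs(y - r) - abs(y - m) for y, r, m in zip(person, reference, mixture)]
    return statistics.fmean(d) / (statistics.stdev(d) / math.sqrt(len(d)))


def simulate(n_snps=10000, n_pool=50, seed=0):
    """Simulate one pool member and one outsider; return their statistics."""
    rng = random.Random(seed)
    # Hypothetical population allele frequencies, one per SNP.
    freqs = [rng.uniform(0.1, 0.9) for _ in range(n_snps)]

    def person():
        # Allele fraction per SNP: 0, 0.5, or 1 (two independent allele draws).
        return [(int(rng.random() < p) + int(rng.random() < p)) / 2 for p in freqs]

    def pool_mean(people):
        # Equal-contribution pool: average allele fraction at each SNP.
        return [sum(col) / len(people) for col in zip(*people)]

    reference = pool_mean([person() for _ in range(n_pool)])
    members = [person() for _ in range(n_pool)]
    mixture = pool_mean(members)
    outsider = person()  # same population, but not in the mixture
    return (distance_statistic(members[0], reference, mixture),
            distance_statistic(outsider, reference, mixture))


if __name__ == "__main__":
    member_t, outsider_t = simulate()
    print(f"pool member: {member_t:.2f}, outsider: {outsider_t:.2f}")
```

In this toy setup, a person who contributed to the pool tends to produce a clearly positive statistic, while someone from the same population who did not contribute hovers near zero. Real array data would add measurement noise and platform-specific normalization that the sketch ignores.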
The researchers tested this approach using a series of simulations followed by experimental validation on a set of complex DNA mixtures. With just 10,000 to 25,000 SNPs it was possible to identify a person of interest even if his or her DNA comprised less than one percent of the total DNA.
Using even more SNPs, they added, it’s possible to identify an individual’s DNA in a complex mixture even when that person’s DNA makes up less than 0.1 percent of the total genomic DNA in the sample.
And, Craig said, the technique should work well in a variety of populations. Although there are SNPs that vary from one population to the next, he said, it is actually relatively simple to weed out the ancestrally relevant SNPs and create an appropriate reference population.
In addition, the researchers found that the Affymetrix and Illumina platforms produced very similar results when used for this application. “There really isn’t a huge advantage with either platform,” Craig said. “Both methods are surprisingly robust.”
Among the potential real-world applications, the researchers noted that this technique holds particular promise for forensics investigations, since it opens the door to analyzing contaminated DNA samples or sampling a large crime scene area.
“It opens up a whole new can of worms of what’s possible to do forensically,” co-author Stanley Nelson, a human genetics researcher who directs the National Institutes of Health’s Neuroscience Microarray Consortium at the UCLA site, said in a statement.
But despite the potential benefits, Craig noted that it has been hard to convince the forensics and research communities that the approach really works as easily and effectively as his team claims. “The universal response is, ‘We don’t believe you,’” he said. Now that the paper has been peer-reviewed and published, Craig hopes that will change.
If the new technique is going to catch on in the forensics field, Craig predicted, it will probably be used first in a few high-profile cases. If it proves valuable there, he added, the approach could become widespread within a few years.
The work also brings a new perspective to the question of whether it’s possible to protect individuals’ identities and genetic privacy by pooling data from genome-wide association studies. “We show that you really can’t,” Craig said, adding that he has been pondering alternative methods for maintaining genetic privacy.
“Within GWA studies, there is a considerable push to make experimental data publicly available so that the data can be combined with other studies,” Craig and his colleagues wrote. “Our findings show that such an approach does not completely conceal identity since it is straightforward to assess the probability that a person or relative participated in a GWA study.”