SAN FRANCISCO (GenomeWeb) – In an effort to spur the adoption of next-generation sequencing in forensics, the National Institute of Standards and Technology has published US population sequence data for 27 autosomal short tandem repeats, including the 20 core loci used in the Federal Bureau of Investigation's Combined DNA Index System (CODIS) database.
The NIST researchers published the STR sequences for 1,036 individuals in the journal Forensic Science International: Genetics last month.
"NGS has come a long way in the last 10 years, but the forensics community has been slow to adopt the newer technology," said Katherine Gettings, senior author of the study and a scientist in NIST's Applied Genetics Group.
There are multiple reasons for that, she said, including that the technology has not yet been validated for use in casework. In order to speed up adoption, the NIST study aims to address one of the gaps — the lack of so-called match statistics.
Analyzing DNA for forensics applications involves characterizing STRs, genomic regions made up of several nucleotides that are repeated over and over again. To do that, researchers determine the number of repeats present at a given locus. This is typically done using capillary electrophoresis or PCR-based fragment analysis, which measure the length of the repeats. There are known population frequencies of STR profiles that researchers use to calculate the likelihood that a given profile is a match to that of a specific person, or whether the two profiles could have been similar due to chance.
These match statistics have been calculated for STR lengths, but "there are also DNA sequences under there," Gettings said, that could have greater discriminatory power. For instance, the additional data offered by NGS could help parse mixed samples containing DNA from multiple individuals or generate more information than size-based STR profiling for samples in which the DNA is highly degraded.
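To illustrate why sequence can discriminate where length cannot, here is a minimal Python sketch. The two allele sequences and the 4-nucleotide repeat unit are hypothetical examples chosen for illustration, not data from the NIST study.

```python
# Minimal sketch: two hypothetical STR alleles that a length-based assay
# would call identical, but that differ once the sequence is read.

def length_allele(seq: str, unit_len: int = 4) -> int:
    """Return the allele call a size-based method would give:
    the number of repeat units, inferred from total length alone."""
    return len(seq) // unit_len

# Hypothetical alleles: both contain 10 four-base repeat units,
# but the split between the two repeat motifs differs.
allele_a = "TCTA" * 4 + "TCTG" * 6
allele_b = "TCTA" * 5 + "TCTG" * 5

same_by_length = length_allele(allele_a) == length_allele(allele_b)
same_by_sequence = allele_a == allele_b

print("Length-based call A:", length_allele(allele_a))
print("Length-based call B:", length_allele(allele_b))
print("Indistinguishable by length:", same_by_length)
print("Indistinguishable by sequence:", same_by_sequence)
```

In this toy case a size-based method reports the same allele for both, while sequencing separates them, which is the kind of added resolution Gettings describes.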
In earlier work, NIST researchers established match statistics for traditional STR profiling using a set of DNA samples from 1,036 individuals spanning Caucasian, African American, Hispanic, and Asian ancestries in order to establish population frequencies for the US. In the recent study, the team used the same set of 1,036 samples to generate sequence data at the STR loci, sequencing the same core 20 loci that are included in the CODIS database.
The NIST publication should help labs that are interested in converting to NGS validate their pipelines, Gettings said. Labs could use the data to retrospectively analyze previous casework, for example, to see how beneficial sequencing could be, since having the sequence information might have led to better identification.
Anytime a forensic sample is a match to a sample in a database, researchers have to put statistical weight behind that match, Gettings said, determining how common or rare it would be for the two different samples to match at a specific locus or loci. To do that, population frequencies are needed, and the population frequencies of the published STR sequences are one step toward being able to use NGS in traditional forensics labs.
In the study, the researchers used the Illumina MiSeq FGx sequencing system and the ForenSeq kit, which was developed by Illumina but is now sold by Verogen. Illumina researchers have also validated the ForenSeq kit and MiSeq FGx instrument with the Scientific Working Group on DNA Analysis Methods (SWGDAM), a group of around 50 scientists that represent federal, state, and local forensics laboratories in the US and Canada. That validation is a necessary step in order to use the technology in a court case.
Gettings said that the NIST group plans to do similar validation work using Thermo Fisher Scientific's NGS technology. Thermo Fisher launched its PrecisionID GlobalFiler NGS STR panel, as well as two mitochondrial DNA panels, in 2016, to be run on either the S5 or the S5 XL NGS system.
Thermo also has a large presence in the more traditional PCR and CE sequencing-based forensics market, and the FBI has approved a number of those products for use by labs generating DNA profiles for the CODIS database.
The NIST team also plans to analyze the 1,036 samples using an NGS kit developed by Promega, called PowerSeq.
For the last several years, researchers in the US and Europe have been looking to develop NGS-based methods for forensics applications. Such work has included both performing STR profiling with NGS and profiling SNPs or mitochondrial DNA, which could help identify age, ancestry, hair color, and other defining features of a person.
For instance, the European Forensic Genetic Network of Excellence (EUROFORGEN-NoE) consortium has validated a SNP-based NGS panel for ancestry identification and is looking at the potential of using RNA or microRNA to determine the tissue of origin of a sample. One facet of the group's work has focused on the ethical implications and privacy risks of implementing more advanced NGS-based DNA profiling methods.
Other research groups in the US have also been developing NGS-based forensics methods that involve analyzing SNPs or mitochondrial DNA, as opposed to STRs. Gettings said that expanding beyond STRs to markers like SNPs and DNA methylation holds a lot of promise, but that it would be a while before they are implemented. The CODIS database "has more than 17 million STR profiles, so we're stuck with that for the current foreseeable future," she said. Most likely, NGS will eventually be used for STR profiling, Gettings said, and the recent publication is one step toward making that possible. The next step is to get the technology accepted by the courts in order to enable it to be used in criminal casework.
For that to happen, Gettings said, the technology would first have to go through either a Frye or a Daubert hearing — processes by which new scientific evidence is determined to be admissible in a court of law. Experts would discuss the technology and how widespread its adoption is, and testify to its validity. A judge would then rule on whether it is acceptable and publish an opinion on it, Gettings explained. She said that although she is unaware of any pending cases, it could happen any day, and she noted that some forensics labs are already using NGS methods in missing persons cases.