NIST Publication of STR Sequence Frequency Data Paves Way for NGS in Forensics


SAN FRANCISCO (GenomeWeb) – In an effort to spur the adoption of next-generation sequencing in forensics, the National Institute of Standards and Technology has published US population sequence data for 27 autosomal short tandem repeats, including the 20 core loci used in the Federal Bureau of Investigation's Combined DNA Index System (CODIS) database.

The NIST researchers published the STR sequences for 1,036 individuals in the journal Forensic Science International: Genetics last month.

"NGS has come a long way in the last 10 years, but the forensics community has been slow to adopt the newer technology," said Katherine Gettings, senior author of the study and a scientist in NIST's Applied Genetics Group.

There are multiple reasons for that, she said, including that the technology has not yet been validated for use in casework. In order to speed up adoption, the NIST study aims to address one of the gaps — the lack of so-called match statistics.

Analyzing DNA for forensics applications involves characterizing STRs, genomic regions made up of several nucleotides that are repeated over and over again. To do that, researchers determine the number of repeats present at a given locus. This is typically done using capillary electrophoresis or PCR-based fragment analysis, which measure the length of the repeats. There are known population frequencies of STR profiles that researchers use to calculate the likelihood that a given profile is a match to that of a specific person, or whether the two profiles could have been similar due to chance.

These match statistics have been calculated for STR lengths, but "there are also DNA sequences under there," Gettings said, that could have greater discriminatory power. For instance, the additional data offered by NGS could help parse mixed samples containing DNA from multiple individuals or generate more information than size-based STR profiling for samples in which the DNA is highly degraded.
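To make the distinction between length-based and sequence-based alleles concrete, here is a minimal Python sketch. The two allele sequences and the TCTA motif are hypothetical, chosen only for illustration, and are not taken from the NIST data. Both alleles have the same length, so a size-based method would report them as identical, while sequencing tells them apart.

```python
import re


def longest_repeat_run(sequence: str, motif: str) -> int:
    """Return the largest number of consecutive copies of `motif` in `sequence`."""
    runs = re.findall(f"(?:{re.escape(motif)})+", sequence)
    return max((len(run) // len(motif) for run in runs), default=0)


# Hypothetical alleles (illustrative only, not real forensic data).
allele_a = "TCTA" * 10                       # ten uninterrupted TCTA repeats
allele_b = "TCTA" * 4 + "TCTG" + "TCTA" * 5  # same length, one variant repeat unit

# A size-based method only sees the fragment length.
same_length = len(allele_a) == len(allele_b)

# Sequencing also sees the internal difference.
same_sequence = allele_a == allele_b

print("Same length:", same_length)
print("Same sequence:", same_sequence)
print("Longest TCTA run in A:", longest_repeat_run(allele_a, "TCTA"))
print("Longest TCTA run in B:", longest_repeat_run(allele_b, "TCTA"))
```

In this toy case the two alleles would share a single size-based call, but they count as two distinct alleles once the sequence is known, which is the kind of extra resolution Gettings describes.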

In earlier work, NIST researchers established match statistics for traditional STR profiling using a set of DNA samples from 1,036 individuals spanning Caucasian, African American, Hispanic, and Asian ancestries in order to establish population frequencies for the US. In the recent study, the team used the same set of 1,036 samples to generate sequence data at the STR loci, including the 20 core loci in the CODIS database.

The NIST publication should help labs that are interested in converting to NGS validate their pipelines, Gettings said. Labs could use the data to retrospectively analyze previous casework, for example, to see how beneficial sequencing could be, since having the sequence information might have led to better identification.

Anytime a forensic sample is a match to a sample in a database, researchers have to put statistical weight behind that match, Gettings said, determining how common or rare it would be for the two different samples to match at a specific locus or loci. To do that, population frequencies are needed, and the population frequencies of the published STR sequences are one step toward being able to use NGS in traditional forensics labs.
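One common way to turn population allele frequencies into a match statistic is the product rule. The Python sketch below assumes Hardy-Weinberg proportions within each locus and independence between loci. The locus names and frequencies are hypothetical placeholders, not values from the NIST publication, and real casework typically applies further corrections that are not shown here.

```python
from math import prod


def genotype_frequency(allele_1: str, allele_2: str, freqs: dict) -> float:
    """Expected genotype frequency at one locus under Hardy-Weinberg proportions."""
    p = freqs[allele_1]
    q = freqs[allele_2]
    # Homozygote: p^2. Heterozygote: 2pq.
    return p * p if allele_1 == allele_2 else 2 * p * q


def random_match_probability(profile: dict, population_freqs: dict) -> float:
    """Multiply per-locus genotype frequencies (product rule across loci)."""
    return prod(
        genotype_frequency(a1, a2, population_freqs[locus])
        for locus, (a1, a2) in profile.items()
    )


# Hypothetical allele frequencies (illustrative only).
population_freqs = {
    "LOCUS_A": {"10": 0.2, "11": 0.3},
    "LOCUS_B": {"15": 0.1},
}

# Hypothetical profile: heterozygous at LOCUS_A, homozygous at LOCUS_B.
profile = {
    "LOCUS_A": ("10", "11"),
    "LOCUS_B": ("15", "15"),
}

rmp = random_match_probability(profile, population_freqs)
print(f"Random match probability: {rmp:.6f}")
```

The same calculation works whether the allele labels are repeat counts or full sequences. Because sequencing can split one length-based allele into several rarer sequence-based alleles, the per-locus frequencies can become smaller, which is why sequence-level population frequencies are needed before labs can report NGS-based match statistics.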

In the study, the researchers used the Illumina MiSeq FGx sequencing system and the ForenSeq kit, which was developed by Illumina but is now sold by Verogen. Illumina researchers have also validated the ForenSeq kit and MiSeq FGx instrument with the Scientific Working Group on DNA Analysis Methods (SWGDAM), a group of around 50 scientists who represent federal, state, and local forensics laboratories in the US and Canada. Such validation is a necessary step before the technology can be used in a court case.

Gettings said that the NIST group plans to do similar validation work using Thermo Fisher Scientific's NGS technology. Thermo Fisher launched its PrecisionID GlobalFiler NGS STR panel, as well as two mitochondrial DNA panels, in 2016, to be run on either the S5 or the S5 XL NGS system.

Thermo also has a large presence in the more traditional PCR and CE sequencing-based forensics market, and the FBI has approved a number of those products for use by labs generating DNA profiles for the CODIS database.

The NIST team also plans to analyze the 1,036 samples using an NGS kit developed by Promega, called PowerSeq.

For the last several years, researchers in the US and Europe have been looking to develop NGS-based methods for forensics applications. Such work has included both performing STR profiling with NGS and profiling SNPs or mitochondrial DNA, which could help identify age, ancestry, hair color, and other defining features of a person.

For instance, the European Forensic Genetic Network of Excellence (EUROFORGEN-NoE) consortium has validated a SNP-based NGS panel for ancestry identification and is looking at the potential of using RNA or microRNA to determine the tissue of origin of a sample. One facet of the group's work has focused on the ethical implications and privacy risks of implementing more advanced NGS-based DNA profiling methods.

Other research groups in the US have also been developing NGS-based forensics methods that involve analyzing SNPs or mitochondrial DNA, as opposed to STRs. Gettings said that expanding beyond STRs to markers like SNPs and DNA methylation holds a lot of promise, but that it would be a while before they are implemented. The CODIS database "has more than 17 million STR profiles, so we're stuck with that for the current foreseeable future," she said. Most likely, NGS will eventually be used for STR profiling, Gettings said, and the recent publication is one step toward making that possible. The next step is to get the technology accepted by the courts in order to enable it to be used in criminal casework.

For that to happen, Gettings said, the technology would first have to go through either a Frye or a Daubert hearing — processes by which new scientific evidence is determined to be admissible in a court of law. Experts would discuss the technology and how widespread its adoption is, and testify to its validity. A judge would then rule on whether it is acceptable and publish an opinion on it, Gettings explained. She said that although she is unaware of any pending cases, it could happen any day, and she noted that some forensics labs are already using NGS methods in missing persons cases.
