NEW YORK (GenomeWeb News) – A controversial genetic forensics method known as familial searching may be prone to complications in the US and other places where structured populations are present, according to a PLoS Genetics study by researchers from the University of California at Berkeley and the University of Washington.
"When familial searching is applied, especially on a large scale, we're going to see increases in false identification of people from particular groups," first author Rori Rohlfs told GenomeWeb Daily News.
Rohlfs, currently a post-doctoral researcher at the University of California at Berkeley, did doctoral research in senior author Bruce Weir's biostatistics lab at the University of Washington.
In the new study, published online last night, the researchers explored statistical questions related to familial searching — a method for digging through DNA databases to find family members of individuals who left DNA at a crime scene — using data on individuals from Vietnamese, African American, European American, Latino, and Navajo populations.
The team found that the ability to correctly infer relationships between individuals from partially overlapping genetic profiles depends heavily on accurate allele frequency estimates at the loci tested and on the background population assumed in the comparison.
In contrast to exact genetic profile matching, in which suspects are identified through perfect overlap between markers in crime scene DNA and genetic profiles in state or federal databases of individuals arrested for or convicted of certain crimes, familial searching looks for partial matches between crime scene samples and genetic markers in an offender database.
If such a match is found, it may be possible to come up with a suspect in a cold case by looking for individuals related to the genetically similar person in the database.
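As a rough illustration of the difference between an exact match and a partial match, the sketch below compares two STR profiles locus by locus. The locus names are real core CODIS markers, but the genotypes, the function names, and the three-way "full / partial / none" summary are assumptions made for this example, not the procedure used by any forensic lab or by the study authors.

```python
from collections import Counter

def shared_alleles(g1, g2):
    """Number of alleles (0, 1, or 2) two genotypes share at one locus."""
    return sum((Counter(g1) & Counter(g2)).values())

def compare_profiles(p, q):
    """Count loci with full, partial, or no allele sharing between two profiles."""
    summary = {"full": 0, "partial": 0, "none": 0}
    labels = ("none", "partial", "full")
    for locus in p.keys() & q.keys():
        summary[labels[shared_alleles(p[locus], q[locus])]] += 1
    return summary

# Hypothetical genotypes (STR repeat counts), invented for illustration only.
crime_scene = {"D3S1358": (15, 17), "vWA": (16, 18), "FGA": (21, 24), "TH01": (6, 9.3)}
database_entry = {"D3S1358": (15, 17), "vWA": (16, 19), "FGA": (22, 24), "TH01": (7, 8)}

print(compare_profiles(crime_scene, database_entry))
```

An exact-match search would require every locus to fall in the "full" bin. A familial search instead flags profiles that share many alleles without matching completely.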
"The idea is that you can look for profiles that are matching at a lot, but not all of the markers," Rohlfs explained. "That partial match might be due to a close genetic relationship between the person in the database and the person who left the sample."
Though familial searching is widely used in the UK, she and her co-authors explained, the US has been slower to adopt the approach due to concerns over its effectiveness, as well as privacy, civil liberties, and other issues related to its use.
Familial searching has been banned in Maryland and Washington, DC, for example, though it is currently used in a few other states, including Colorado and Virginia. In California, another state permitting familial searching, the approach has already turned up suspects in some high-profile cases, among them a case involving a Los Angeles serial killer nicknamed the "Grim Sleeper."
Still, because it is a relatively new method, the scientific and statistical subtleties of familial searching are not as well studied as those surrounding forensic identification based on exact genetic matches, Rohlfs said.
"I was particularly interested in how some of the population genetic assumptions might play out differently in familial searching rather than exact identification," she explained.
To begin looking at this in more detail, she and her colleagues did simulations that took into account population structure and population-dependent allele frequencies at loci favored in forensics investigations.
In particular, they focused on 13 standardized short tandem repeats that are genotyped in samples housed in the Combined DNA Index System, or CODIS, a federal database that brings together national DNA and offender/arrestee information.
The national database is not tapped in familial searches, Rohlfs said, though state databases used for such searches typically include the same 13 markers. For example, California's DNA database relies on 15 genetic markers, including the 13 CODIS STRs, though older samples in the database may only have genotyping information at the original 13 sites.
Though the selected STRs are used because they are independent from one another and known to be polymorphic in at least some populations, the extent of the genetic variability at these sites shifts depending on the population considered, the researchers explained.
Their analyses, based on genotyping data for 150 to 213 individuals each from self-identified Vietnamese, African American, European American, Latino, and Navajo populations, highlighted ways in which lower variability at these loci or incorrect allele frequency estimates can muddle predictions about relatedness within some populations.
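The effect of lower variability can be sketched with a toy simulation. The code below is not the authors' simulation: it ignores population structure, uses the same invented allele frequencies at all 13 loci, and flags a pair as a "possible relative" when the number of shared alleles reaches a hypothetical threshold. It draws pairs of unrelated individuals and reports how often they are flagged anyway.

```python
import random
from collections import Counter

N_LOCI = 13  # number of core CODIS STRs mentioned in the article

def draw_profile(freqs, rng):
    """Draw one unrelated individual's genotypes from allele frequencies."""
    alleles, weights = list(freqs), list(freqs.values())
    return [tuple(rng.choices(alleles, weights=weights, k=2)) for _ in range(N_LOCI)]

def shared_allele_count(p, q):
    """Total alleles shared across all loci (0 to 2 per locus)."""
    return sum(sum((Counter(a) & Counter(b)).values()) for a, b in zip(p, q))

def false_positive_rate(freqs, threshold, n_pairs=2000, seed=0):
    """Fraction of unrelated pairs sharing at least `threshold` alleles."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_pairs):
        p, q = draw_profile(freqs, rng), draw_profile(freqs, rng)
        if shared_allele_count(p, q) >= threshold:
            hits += 1
    return hits / n_pairs

# Invented frequencies: ten equally common alleles vs. four unevenly common ones.
HIGH_DIVERSITY = {allele: 0.1 for allele in range(10)}
LOW_DIVERSITY = {0: 0.55, 1: 0.25, 2: 0.15, 3: 0.05}

rate_high = false_positive_rate(HIGH_DIVERSITY, threshold=13)  # hypothetical cutoff
rate_low = false_positive_rate(LOW_DIVERSITY, threshold=13)
print(rate_high, rate_low)
```

Under these assumptions, unrelated pairs from the less variable population are flagged far more often, simply because common alleles are shared by chance.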
"For some populations these STRs, these microsatellites are not as polymorphic, so there's not as much identifying information," Rohlfs said. "And so even if you have the right population genetic assumptions, you might get more false relative identification for people of that ancestry."
In addition, they found that the background population assumed can be crucial when trying to tease apart general population patterns for the genetic markers from those shared between related individuals.
"If you assume the wrong [background population] — if you assume, say, European American [population] when samples actually come from individuals of Asian descent — then you might be more likely to falsely think that people are related," Rohlfs noted.
Moreover, she added, it may not be immediately clear whether an abundance of false positive matches between a crime scene sample and individuals in an offender database is due to inaccurate population assumptions or simply reflects the presence of a genetic profile that's especially common in a given population.
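To see why the assumed background population matters, consider a standard single-locus likelihood ratio that compares "related" with "unrelated" using identity-by-descent probabilities (k0, k1, k2). This is a textbook formulation offered as a sketch, and it may differ from the exact statistics evaluated in the study. The allele frequencies below are invented, and the function names are hypothetical.

```python
def genotype_prob(g, freqs):
    """Probability of an unordered genotype under Hardy-Weinberg proportions."""
    a, b = g
    return freqs[a] ** 2 if a == b else 2 * freqs[a] * freqs[b]

def prob_given_one_ibd(g1, g2, freqs):
    """P(g2 | g1) when exactly one allele is shared identical by descent."""
    total = 0.0
    for shared in g1:  # each allele of g1 is the IBD allele with probability 1/2
        if shared in g2:
            rest = list(g2)
            rest.remove(shared)
            total += 0.5 * freqs[rest[0]]  # the other allele is drawn from the population
    return total

def kinship_lr(g1, g2, freqs, k=(0.0, 1.0, 0.0)):
    """Likelihood ratio of a relationship with IBD probabilities k vs. unrelated."""
    k0, k1, k2 = k
    p2 = genotype_prob(g2, freqs)
    identical = 1.0 if sorted(g1) == sorted(g2) else 0.0
    return k0 + k1 * prob_given_one_ibd(g1, g2, freqs) / p2 + k2 * identical / p2

# Invented frequencies: allele 12 is rare in the assumed population, common in the true one.
ASSUMED_POP = {12: 0.05, 14: 0.30, 15: 0.20, 16: 0.45}
TRUE_POP = {12: 0.40, 14: 0.25, 15: 0.20, 16: 0.15}

g1, g2 = (12, 14), (12, 15)  # two profiles sharing allele 12 at this locus
lr_assumed = kinship_lr(g1, g2, ASSUMED_POP)  # parent-child: k = (0, 1, 0)
lr_true = kinship_lr(g1, g2, TRUE_POP)
print(lr_assumed, lr_true)
```

With these made-up numbers, the shared allele looks like strong evidence of a parent-child relationship (a ratio of about 5) under the wrong frequencies, but weighs against relatedness (about 0.6) under the frequencies of the population the samples actually come from. Passing `k=(0.25, 0.5, 0.25)` gives the corresponding full-sibling ratio.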
While the analysis described in the study focused on autosomal markers, the team noted that resolving relationships between individuals could potentially be improved by including additional markers, particularly lineage-informative sites on the Y chromosome or within the mitochondrial genome — a possibility that Rohlfs said she is interested in exploring in the future.
In the meantime, study authors cautioned that "care is warranted in the use and interpretation of familial searching forensic techniques."
"If implemented with the core CODIS loci, familial searching may result in low distinguishability and potentially high false positive rates among certain groups," they wrote.