European Consortium to Validate NGS Toolkit for Analyzing Phenotypes in Forensics

NEW YORK (GenomeWeb) — A European consortium of forensic geneticists is working to develop and validate standard next-generation sequencing tools for DNA phenotyping and biogeographical ancestry testing.

The Visible Attributes through Genomics (VISAGE) Consortium involves forensic geneticists at academic research and police laboratories from eight European countries. The project, which launched last year with EUR 5 million ($6 million) over four years from the Horizon 2020 EU research and innovation program, aims to validate NGS tools for identifying the appearance, age, and ancestry of unknown perpetrators using DNA evidence.

The VISAGE collaborators are in the process of validating a basic toolkit that includes currently available markers on eye color, hair color, and skin color; provides information about the continental ancestry of a suspect; and employs DNA methylation analysis to determine biological age. Once the basic toolkit is validated, "it should establish a common platform for using NGS-based phenotyping" among police labs, said Peter Schneider, a professor of forensic molecular genetics at the University Hospital Cologne's Institute of Legal Medicine.

The launch of the project coincides with efforts in Germany to amend the Code of Criminal Procedure to allow DNA phenotyping and biogeographical ancestry testing to investigate serious crimes where the perpetrator is unknown.

These efforts have garnered pushback from a group of scientists, ethicists, and lawyers led by Veronika Lipphardt, a professor of science and technology studies at University College Freiburg. The group cautions that these tests could become tools for profiling and discriminating against minorities if they're implemented in criminal investigations without accounting for their limitations and before putting the appropriate checks and balances in place. Her group is also concerned that the public discussion of these technologies to date has been overly optimistic about their capabilities without discussing their limitations.

"These tests do not need to be 100 percent precise," Lipphardt said. "We know that this would be an unrealistic and even naive expectation. But if you convince the public with some unrealistically high probabilities, you raise very high expectations of what the technologies can do."

In Schneider's view, forensic DNA phenotyping should be used rarely. In Germany, "I would prefer that forensic DNA phenotyping would be restricted in its use by the national code of criminal procedures and be allowed only in capital crimes and sexual offenses, and only if other investigative options have been exhausted," he said.

He acknowledged that if the authorities don't fully understand these technologies, then there is a risk they might apply these tools in ways that are discriminatory. But he is optimistic this can be addressed with the use of validated tools, like the basic toolkit being developed within VISAGE, and through education.

The tools could be an important resource for the forensics community, given the ability of NGS to handle the small, often damaged biological samples that are discovered at crime scenes. Researchers can also use NGS to analyze more markers in larger data sets and further refine the predictive capabilities of these tests.

The VISAGE collaborators are also developing software to streamline and standardize interpretation of the data and are formulating best practices for reporting results. The group has plans to create training courses for the police, investigators, and judges so they understand what the tests can and cannot reveal about a person, the chance for errors, and how to communicate the results.

"Most people don't understand what these techniques can really do," said Denise Syndercombe Court, a professor of forensic science at King's College London. "People think these technologies can tell you something about somebody that's free of error, because they are used to thinking of DNA analysis as providing almost certainty."

However, phenotypic predictions in forensic DNA analysis are reported in terms of probabilities, delineating the likelihood that a person has blue eyes, dark skin, or red hair. These predictive probabilities are calculated based on algorithms that are developed using reference samples from different populations.
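To illustrate what "reported in terms of probabilities" can look like in practice, here is a minimal sketch of a multinomial logistic model that turns allele counts into a probability for each eye color category. The marker names (`snp_a`, `snp_b`), the category list, and every coefficient are hypothetical values chosen for illustration; they are not taken from VISAGE or from any published prediction model.

```python
import math

# All names and numbers below are hypothetical, for illustration only.
CATEGORIES = ["blue", "intermediate", "brown"]
INTERCEPTS = {"blue": 0.0, "intermediate": -1.0, "brown": -2.0}
WEIGHTS = {
    "blue": {"snp_a": 0.0, "snp_b": 0.0},
    "intermediate": {"snp_a": 0.8, "snp_b": 0.3},
    "brown": {"snp_a": 2.0, "snp_b": 0.5},
}


def predict_probabilities(genotype):
    """Return a probability per category for a genotype.

    `genotype` maps each marker name to an allele count (0, 1, or 2).
    """
    # Linear score per category: intercept plus weighted allele counts.
    scores = {
        category: INTERCEPTS[category]
        + sum(WEIGHTS[category][marker] * count for marker, count in genotype.items())
        for category in CATEGORIES
    }
    # Softmax turns the scores into probabilities that sum to 1.
    highest = max(scores.values())
    exponentials = {c: math.exp(s - highest) for c, s in scores.items()}
    total = sum(exponentials.values())
    return {c: value / total for c, value in exponentials.items()}


if __name__ == "__main__":
    for name, genotype in [
        ("sample 1", {"snp_a": 0, "snp_b": 0}),
        ("sample 2", {"snp_a": 2, "snp_b": 2}),
    ]:
        probabilities = predict_probabilities(genotype)
        print(name, {c: round(p, 3) for c, p in probabilities.items()})
```

In a real tool the coefficients would be estimated from reference samples, which is why the composition of those reference sets matters for how well the predictions transfer to other populations.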

Currently, DNA phenotyping is around 96 percent accurate for differentiating dark and light eye color, but less robust for determining intermediate shades like hazel. Similarly, tests can predict accurately if someone has light or dark skin, but gradations of pigmentation in the middle are harder to pin down. Redheads can be predicted with high accuracy, but not that much is known about graying hair, for example. Of course, these algorithms for genetic predictions can't account for environmental factors, such as the sun, which can change skin color, and efforts by people to disguise their natural phenotype, for example by dyeing their hair.

Another difficulty in DNA phenotyping is that the algorithms have been largely developed using samples from European populations, which means that they might not work as well if the person being tested comes from another population, said Syndercombe Court, who conducts research in this area, and collaborates with scientists in VISAGE, but is not part of the consortium. She noted that one of the goals of VISAGE is to address this problem by using samples from different groups.

Biogeographical ancestry testing, meanwhile, can reliably home in on the continent someone is from. For example, a Nigerian person will be predicted as having African ancestry and a Chinese person as having East Asian ancestry. But the results are harder to apply in criminal investigations when someone has mixed ancestry. Errors are more likely to happen, according to Syndercombe Court, in populations with significant admixture.

People from the Middle East present a challenge for current biogeographical ancestry tests, because the region's population is influenced by Europe, South Asia, and North Africa. "So, it will be difficult to differentiate them from these populations if we don't have Middle East population in the prediction algorithm. And we don't generally," Syndercombe Court said. "These are not populations that have been easy to gain access to, at least not representative of the whole of the Middle East."

Because the German debate to allow DNA phenotyping and ancestry testing has escalated in step with concerns over the influx of asylum seekers in the country, many of them from the Middle East, it will be important for law enforcement to understand these types of limitations. "If we have an unknown who is from the Middle East, then we obviously won't be suggesting that they are from the Middle East if that population is not part of the prediction algorithm," Syndercombe Court said. "Instead we may suggest that they are from Europe or South Asia … and that prediction may show as being weak, which may suggest uncertainty in the prediction."
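The limitation Syndercombe Court describes can be sketched with a toy likelihood-based classifier. This is an assumption-laden illustration, not the method used by any forensic tool: the three reference populations, the three markers, and all allele frequencies are invented, and the model assumes independent markers, Hardy-Weinberg genotype proportions, and equal prior weight for each population.

```python
from math import comb

# Hypothetical allele frequencies at three markers, for illustration only.
# Note that no Middle Eastern population is in the reference panel.
REFERENCE_FREQS = {
    "Europe": [0.9, 0.2, 0.7],
    "South Asia": [0.6, 0.5, 0.4],
    "East Asia": [0.1, 0.8, 0.2],
}


def genotype_likelihood(genotype, freqs):
    """Probability of the allele counts given one population's frequencies."""
    likelihood = 1.0
    for count, p in zip(genotype, freqs):
        # Binomial probability of `count` copies out of 2 alleles.
        likelihood *= comb(2, count) * p**count * (1 - p) ** (2 - count)
    return likelihood


def ancestry_posteriors(genotype):
    """Normalize likelihoods across the reference populations (equal priors)."""
    likelihoods = {
        population: genotype_likelihood(genotype, freqs)
        for population, freqs in REFERENCE_FREQS.items()
    }
    total = sum(likelihoods.values())
    return {population: value / total for population, value in likelihoods.items()}


if __name__ == "__main__":
    # A made-up genotype that does not fit any single reference population well.
    posteriors = ancestry_posteriors([2, 1, 1])
    for population, probability in posteriors.items():
        print(f"{population}: {probability:.3f}")
```

Whatever the sample's true origin, the output can only name populations that are in the reference panel. With these invented numbers the probability is split mainly between Europe and South Asia, which is the kind of weak prediction that may signal uncertainty.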

In her own research, Syndercombe Court tries to improve biogeographical ancestry predictions by incorporating data from additional populations and by assessing how well the test predicts the ancestry of the resident UK population, which is quite diverse. "What I do know is that when I use our tools to make biogeographical ancestry predictions on people from other parts of the world, it doesn't work as well, probably because the mixed ancestry [there] is different from the mixed ancestry in the UK," she said.

Schneider noted that the biogeographical ancestry tools being developed within VISAGE will be validated using worldwide population data from the 1000 Genomes Project, the Human Genome Diversity Project, and other newly collected sample sets. The long-term goal of the project is to develop an NGS-based advanced toolkit that includes validated markers from different reference populations and provides more subcontinental resolution.

"We're fully aware that we can only make predictions in populations or subpopulations where we have adequate reference samples," Schneider said, adding that it's important to understand that "the passport is not imprinted into the DNA," so nationality can never be predicted.

Communicating this to politicians in Germany, and law enforcement who want to use these tools, may be a challenge, he acknowledged. "The problem I see is that some of the politicians are driven by public opinion and may have expectations that aren't realistic," Schneider said.

"We have to be able to depend on police understanding the limitations of the evidence," said David Kaye, a law professor at Penn State who teaches about the application of genetic evidence in forensics. He has seen police departments too enthusiastically take up technologies without fully appreciating their limits. For example, "some police officers swear by voice stress analysis to tell whether someone is lying or not, but very few scientists would accept the idea that there is much if anything to that," he said.

Experts said it will be important to communicate to law enforcement that DNA phenotyping and ancestry testing should not be the first strategy in an investigation and, if applied, the information should not be used in a way that publicly casts suspicion on an entire community. "These technologies should never be used for purposes other than intelligence," Syndercombe Court said. "This information can be kept in the toolkit of the police officer when they're trying to prioritize leads in a crime, but should not replace standard DNA profiling or much more useful standard police techniques."

In England and Wales, even though DNA testing for visible characteristics and ancestry is legally permitted, the police hardly ever have to resort to such analysis, because if there is a biological sample available at a crime scene, they first look for a match in the national database. "The UK database is large, and that's good, because the matching can be done with high certainty and the searching is done silently," Syndercombe Court said.

The Protection of Freedoms Act 2012 required the deletion from the UK's national DNA database of nearly 1.5 million profiles of individuals who were arrested or charged but not convicted of an offense. But since then, the repository has grown almost to its prior size, and currently contains 5.86 million DNA records collected at crime scenes and from those suspected of crimes. The repository contains profiles of around 10 percent of UK residents, and boasts a 63-percent match rate. However, a match doesn't necessarily mean that it helps in solving a crime.

Similarly, when forensic geneticists talk about the successful application of forensic DNA phenotyping and biogeographical ancestry testing, they may mean that these methods delivered data that was a reliable estimate, or corroborated other investigative findings, said Matthias Wienroth, who researches the ways that technologies and society interact at Newcastle University in the UK. But that doesn't mean that law enforcement applied this information successfully, or that it contributed to the detection and prosecution of the criminal.

For example, in the case of the UK's "Night Stalker," who burglarized and raped elderly women from 1992 to 2009, the police turned to a now-defunct US company called DNAPrint Genomics, which marketed a test it claimed could narrow down suspects based on ancestry. The test predicted that the "Night Stalker" had Caribbean ancestry, and based on the results, the police specifically focused on the Windward Islands, where they even went to investigate.

Delroy Easton Grant, who looks black and is Caribbean, but from Jamaica rather than the Windward Islands, was ultimately arrested using traditional police techniques in 2009 and convicted of his crimes. Even though genetic testing provided some clues, the police were not able to use them to identify him. However, because the DNA analysis pointed to the Caribbean region, forensic scientists may say that the application of the technology was successful for their purposes, said Wienroth.

"This is a distinction that political decision-makers and advocates of these technologies among the criminal justice community don't seem to quite understand yet," he noted. "When they hear a forensic scientist say this technology is successful, they think this will be successful for criminal justice purposes. But whether this information is useful for criminal investigation is often an entirely different story, and this distinction needs to be made more apparent."

In the US, when the Combined DNA Index System doesn't turn up a match among the 17.2 million profiles from offenders, arrestees, and crime scene samples, and the investigative trail turns cold, police sometimes turn to other kinds of genetic analysis, like DNA phenotyping and familial searching. Most recently, law enforcement in California used genealogy profiles posted to a public database, called GEDmatch, to apprehend a man they suspect is the "Golden State Killer," who carried out 12 murders and at least 45 rapes.

However, The Associated Press reported that, using a different genealogy site, investigators first went after a man whose DNA did not match the samples found at crime scenes and who could not have been the killer. According to genetics experts, the misstep was likely due to the use of Y chromosome analysis instead of autosomal DNA analysis in familial searching, but it highlights the importance of police understanding the limitations of these tools and the risk of false leads.

There is also the risk that DNA, like fingerprints, might end up at a crime scene in innocent ways, leading police to pursue the wrong person. This could happen even when there's a "cold hit" to a profile from a convicted-offender DNA database, Kaye noted. Before police use the less exact tools of DNA phenotyping and ancestry testing, however, he thinks there need to be more studies to quantify and validate the probabilities of some of the characteristics they look for.

In the US, there is no legal barrier to collecting DNA from a crime scene and analyzing it for visible characteristics or ancestry, and the police have, from time to time, turned to commercial firms to use DNA phenotyping in cold cases. However, Lawrence Kobilinsky, deputy chair of the forensic science department at John Jay College of Criminal Justice, has found that crime labs aren't doing this kind of analysis outside of research because there are too many questions about the reliability of the technology to identify and exclude groups of people in investigations.

"Despite these concerns, in the US, there's no question that this type of testing is moving forward," he said. "And as soon as NGS-based analysis becomes more feasible, crime labs are going to be doing this stuff."
