UNC Group Refines Framework for Returning Incidental Results from Genome Sequencing


Researchers at the University of North Carolina at Chapel Hill have tested a strategy for returning incidental findings from genome sequencing to patients and plan to use this approach in an ongoing research study.

Their strategy uses a structured framework that divides the genome into categories or "bins" prior to the analysis, based on clinical utility and validity. Patients in the study will always receive clinically actionable information, and a subset may elect to obtain additional information according to their preferences.

With the help of the framework, which the researchers first outlined last year, "you can do a very efficient job of counseling patients about what to expect, and analyzing and returning their data," said Jonathan Berg, an assistant professor of genetics at UNC, who helped develop the scheme. "This framework is partly about dividing up the genome so people can make informed choices about what they want to learn."

Berg is also the lead author of a recent publication in Genetics in Medicine that describes the first application of the framework to genomic data from 80 individuals.

For that study, the researchers selected 2,016 genes from the Online Mendelian Inheritance in Man database that are associated with Mendelian disorders.

They then reviewed these genes manually and placed them into different categories.

Bin 1 contains 161 clinically actionable genes, for which there is "a reasonable suggestion of beneficial interventions."

Bin 2c contains 57 genes with "potentially significant risk of [causing] psychosocial harm," for example genes for Huntington's and prion diseases. "We feel there is a very small subset of genes where we ought to take special care in terms of making sure that someone is fully aware of what they are asking for when they say, 'I'd like to get my information about these disorders,'" Berg explained.

Bin 2b harbors the remaining 1,798 genes, which have clinical validity but no clinical intervention or significant risk of harm from learning about them.

The study did not consider any genes or variants not associated with Mendelian diseases, such as GWAS risk or pharmacogenomics variants, which would otherwise have gone into bin 2a.
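The binning scheme described above can be sketched as a small data structure. This is an illustrative sketch only, not the UNC group's software: the names `Bin`, `GENE_COUNTS`, and `bins_to_report` are hypothetical, and the only facts taken from the article are the bin definitions, the gene counts, and the rule that actionable (bin 1) findings are always returned.

```python
from enum import Enum
from typing import Iterable, Set


class Bin(Enum):
    """Hypothetical labels for the categories described in the article."""
    BIN_1 = "clinically actionable"
    BIN_2A = "non-Mendelian, e.g. GWAS risk or pharmacogenomics (not assessed in the study)"
    BIN_2B = "clinically valid, no intervention, no significant risk of harm"
    BIN_2C = "clinically valid, potentially significant risk of psychosocial harm"


# Gene counts per bin as reported in the article (bin 2a was not populated).
GENE_COUNTS = {Bin.BIN_1: 161, Bin.BIN_2B: 1798, Bin.BIN_2C: 57}


def bins_to_report(opted_in: Iterable[Bin]) -> Set[Bin]:
    """Bin 1 is always returned; other bins only if the patient chose them."""
    return {Bin.BIN_1} | set(opted_in)


total_genes = sum(GENE_COUNTS.values())
print(total_genes)  # 2016, matching the number of OMIM genes selected
print(sorted(b.name for b in bins_to_report([Bin.BIN_2C])))
```

The three populated bins add up to the 2,016 genes selected from OMIM, which is a quick consistency check on the reported counts.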

Berg stressed that the gene placements are preliminary, and that genes may switch between categories in the future as more information becomes available about their function. Placing genes in bin 2c, in particular, was a judgment call, he said, and patients may still decide to learn about these. "It's not about restricting access to information but more about making sure that the access to that information is informed."

Using the genomes of 80 individuals – 61 genomes from presumably healthy people made available by Complete Genomics and 19 from patients enrolled in a research study on hereditary cancer susceptibility – the scientists then devised criteria to select variants in the 2,016 genes that should be manually reviewed. It would be impossible to review all variants in those genes, Berg said, because each person carries about 200,000 of them on average.

Their final algorithm chooses variants for review if they have an allele frequency of less than 5 percent and are either protein-truncating or listed as disease-causing in the Human Gene Mutation Database. On average, per person, this flagged 1.5 variants in bin 1 genes, 6.4 variants in bin 2b genes, 0.2 variants in bin 2c genes, and 9.2 variants indicating carrier status for recessive disorders. "That number someone could go through in a reasonable amount of time and make a judgment call about them," Berg said.
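The selection rule as stated in the article can be written as a short filter. This is a minimal sketch under stated assumptions, not the researchers' actual pipeline: the `Variant` record, its field names, and the example gene labels are hypothetical, and only the rule itself (allele frequency below 5 percent, and either protein-truncating or listed as disease-causing in HGMD) comes from the text.

```python
from dataclasses import dataclass

MAX_ALLELE_FREQUENCY = 0.05  # the 5 percent cutoff described in the article


@dataclass(frozen=True)
class Variant:
    """Hypothetical variant record; field names are assumptions."""
    gene: str
    allele_frequency: float
    protein_truncating: bool
    hgmd_disease_causing: bool


def needs_manual_review(variant: Variant) -> bool:
    """Flag rare variants that are truncating or listed in HGMD as disease-causing."""
    is_rare = variant.allele_frequency < MAX_ALLELE_FREQUENCY
    return is_rare and (variant.protein_truncating or variant.hgmd_disease_causing)


# Made-up example variants, for illustration only.
variants = [
    Variant("GENE_A", 0.001, True, False),   # rare and truncating: flagged
    Variant("GENE_B", 0.20, False, True),    # HGMD-listed but common: skipped
    Variant("GENE_C", 0.01, False, False),   # rare missense not in HGMD: skipped
    Variant("GENE_D", 0.03, False, True),    # rare and HGMD-listed: flagged
]

flagged = [v.gene for v in variants if needs_manual_review(v)]
print(flagged)  # ['GENE_A', 'GENE_D']
```

Note how `GENE_C` is dropped: a rare missense variant that is not in HGMD never reaches manual review, which is the loss of sensitivity the article goes on to discuss.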

By not including most missense variants, the analysis loses sensitivity, meaning it might not pick up all potentially disease-relevant mutations in the 2,016 genes. But for practical reasons, not all missense variants can be included, Berg said. "You want to find the things that are clearly disease-causing without overburdening either the lab or the informatics person or the clinician or the patient with information that's of unclear clinical relevance," he said. "This is a streamlined analysis for incidental findings."

He said that for a diagnostic test where an inherited disease is suspected, the analysis would be different – in that case, it would include variants of unknown significance. However, "what we are dealing with here is a way to strip down the incidentalome to just those key variants that are clinically important."

Also, when looking for Mendelian conditions, most of which are very rare, it is more important to be specific than sensitive. "You have to be absolutely certain that the variant is a disease-causing one," he said. And because the probability of someone having a rare disease is small, "even if your test is only 80 percent sensitive, your negative predictive value is still extremely good."
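Berg's point about negative predictive value follows from the standard definition, and a quick calculation shows why. The 80 percent sensitivity is the figure he quotes; the prevalence of 1 in 10,000 and the specificity of 99.9 percent are assumed values chosen only for illustration.

```python
def negative_predictive_value(sensitivity: float, specificity: float,
                              prevalence: float) -> float:
    """NPV = true negatives / (true negatives + false negatives)."""
    true_negatives = specificity * (1.0 - prevalence)
    false_negatives = (1.0 - sensitivity) * prevalence
    return true_negatives / (true_negatives + false_negatives)


# 0.80 sensitivity is from the article; the other two values are assumptions.
npv = negative_predictive_value(sensitivity=0.80,
                                specificity=0.999,
                                prevalence=1 / 10_000)
print(f"NPV = {npv:.6f}")
```

Under these assumptions the NPV is above 99.99 percent: because so few people have the condition, the handful of missed cases is tiny compared with the number of true negatives.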

Like other research groups before them, Berg and his colleagues found that "a sizeable number" of missense mutations listed as disease-causing in HGMD have an allele frequency greater than 5 percent, meaning they are probably not linked to disease. "It does raise some concern about why they are in the database," he said, and highlights the need for a clinical-grade variant database.

The first project the researchers plan to apply their method to is the North Carolina Clinical Genomic Evaluation by Next Gen Exome Sequencing study, for which UNC received $6.4 million in funding from the National Human Genome Research Institute last year under the Clinical Sequencing Exploratory Research Project program (CSN 1/25/2012).

That study will enroll at least 750 patients suffering from cancer, dysmorphic birth defects, and neurodegenerative diseases. One group of patients will only receive actionable results from their exome data, or bin 1 findings, while the other will be able to choose which additional incidental findings to receive. "We will be studying who decides to get different types of information, why they made this decision, and what the impact was," Berg said.

The first dozen or so patients were enrolled in a pilot phase and were not given access to incidental findings, Berg said, but the group is now in the process of enrolling participants for the randomized part of the study.

Berg said one of the key advantages of their approach is that "you know exactly what types of variants you will get and you know exactly what types of variants you won't get."

"We know exactly how we are going to do the analysis in every single person every single time; it's going to be very reproducible," he said, "whereas if you're dealing with these things on an ad hoc, case-by-case basis, there is definitely room for some variation" in how the same finding is dealt with in different people.

It is also possible to adapt their framework for use in special patient groups, for example in children or cancer patients.

There is still room for improving the variant selection algorithm, though. For example, Berg said, there are some genes in which almost all disease-causing mutations are known to be missense mutations, or where missense mutations are known to cluster in a certain region. "We continue to refine this method based on what we know about the molecular causation of disease," he said.

Other groups within the CSER program take different approaches for returning incidental findings, he said. Some, for example, plan to return all incidental findings without distinguishing between different types.

Ingrid Holm, a principal investigator of the Gene Partnership project of Boston Children's Hospital, said that their project also plans to return incidental results from genome sequencing to participants, guided by a so-called Informed Cohort Oversight Board.

That board will use a similar approach to UNC's, she said, returning results according to patients' preferences, and overruling those preferences "if there would be imminent harm without knowing" the results, which would be equivalent to bin 1 results in UNC's framework.

"Essentially, I don't think it's necessarily that different a type of approach," she said.

Holm said that UNC's binning method "makes sense" but that the categories will need to be updated frequently as new information about disease variants surfaces, "and there are going to be things that come up in someone's genome that you don't have in your bins.

"It’s a very logical and interesting approach, and it's a good start," she said. "You are not going to be able to go back to the drawing board for every single genome that comes in and figure out what's there, and what you return or not. It's going to need some kind of algorithm to sort that out."
