FDA Plans to Use Public Genetic Variant Databases for NGS Test Regulation But Not Many May Qualify

NEW YORK (GenomeWeb) – The US Food and Drug Administration released a draft guidance last week laying out a process by which next-generation sequencing test developers can use publicly accessible genetic variant databases recognized by the agency to establish the clinical validity of such tests. 

Industry insiders GenomeWeb spoke to said public variant databases could lessen the regulatory burden for NGS test developers, but also pointed out that there aren't too many databases available right now that would meet the criteria FDA has proposed and that there is a lack of incentives for submitting to and maintaining such resources.

The agency released the clinical validity draft guidance alongside proposed recommendations for demonstrating the analytical validity of germline NGS tests. Both documents contain preliminary recommendations that, when finalized, would be voluntary for test developers to take up, according to the FDA. 

In the document on clinical validity, the FDA discusses the features it will look for in public databases when deciding whether to accept evidence on variants from them, and describes the process by which database administrators can apply for recognition of a variant database. Using this approach, the agency is hoping to streamline the process for demonstrating clinical validity of variants gauged by NGS tests, encourage data sharing in such repositories, and advance the field's collective understanding of gene-disease relationships.

"Given the volume of data generated by NGS-based tests and the rarity of many genetic variants, it would be inefficient for each test developer to generate its own dataset sufficient to support its claims of clinical validity," FDA spokesperson Lindsay Meyer told GenomeWeb. "The capability in the future to draw upon a high-quality base of aggregated data, such as an FDA-recognized genetic database, will streamline developers’ efforts to make tests available to patients."

Although the FDA has traditionally required test developers to submit data from clinical trials, scientific studies, and case histories to establish the association between a genetic variant and a phenotype, this is unsustainable as more NGS tests come to market, since these types of tests can gauge millions of variants across many genes at once and can be used to diagnose diseases, assess predisposition to complex conditions, and predict the likelihood of response to treatments.

"While current regulatory approaches are appropriate for conventional diagnostics that detect a single disease or condition (such as blood glucose or cholesterol levels)," the FDA explained, "these new sequencing techniques contain the equivalent of millions of tests in one."

Instead of conducting clinical validity studies on specific variants, the draft guidance opens up a process by which test developers could reference variant data in FDA-recognized databases. Administrators that want FDA recognition for a database would have to submit documentation on aspects such as standard operating procedures; data privacy and security; variant curation, interpretation, and reinterpretation; personnel qualifications; and conflicts of interest.

Once the FDA has reviewed this information and recognized a database, the information in the repository has to be publicly accessible. The agency will create a publicly available list of recognized variant databases and review them regularly to maintain that status. "Continued transparency about methods and assertions will play a critical role in maintaining confidence in a genetic variant database and thus, maintaining recognition," the agency states in the draft guidance.

Meyer added that in the future, the FDA may exempt NGS tests from premarket review of clinical validity if developers cite data from FDA-recognized public databases. However, maintaining public variant databases will require funding and will depend on researchers and diagnostics developers submitting and continually updating variant data as the evidence evolves.

The agency hasn't said how it plans to incentivize data sharing within public repositories. "The FDA is not requiring that any entity submit data to a database," Meyer said. "However, we hope that labs will submit their data so that a robust and useful clinical data source can be developed and maintained."

Moreover, the agency doesn’t intend to limit its recognition to any single database, she said, but to create a process by which database administrators could demonstrate and be transparent about the reliability of variant associations and interpretations in a repository. "There are currently no FDA-recognized genetic databases," Meyer said. 

Difficult recognition

Based on the process and criteria outlined in the draft guidance, experts in the field expect that Johns Hopkins University's CFTR2 database, which the agency has already used to clear Illumina's 139-variant NGS panel for cystic fibrosis, would likely be able to garner FDA's recognition. But they couldn’t name too many others. 

Heidi Rehm, director of the Laboratory for Molecular Medicine at Partners Healthcare Personalized Medicine, said that a process for achieving FDA recognition of public databases will push the genetic testing industry in a positive direction. The increasing use of NGS testing is revealing a large number of exceedingly rare variants, and a growing group across academia and industry believes the best way to advance knowledge about these variants is to share and aggregate data in public repositories.

Rehm has been urging payors, peer-reviewed journals, and the FDA to encourage, or even require, that labs submit variant data into ClinVar, a freely available archive of genotype and phenotype relationships the NIH launched three years ago. Although 542 submitters have contributed variant data to ClinVar, not all of the data in the repository would meet all of FDA's criteria for recognition in the draft guidance. 

The FDA states in the guidance, for example, that databases would need to detail the evidence and scoring system used to interpret variants' association to diseases. ClinVar currently contains more than 200,000 records but only 56 percent of these describe the criteria by which submitters arrived at the clinical significance of the variant, and 10 percent don't have interpretations.

The FDA also proposes to recognize databases that have publicly available standard operating procedures for curation and variant interpretation, including decision matrices based on professional guidelines. ClinVar has curation guidelines for nomenclature, disease ontologies, and clinical significance terminologies that submitters have to follow.

The agency, however, would also like to see "metadata" on the variants featured in a database, including the number of labs reporting that variant, the tests used to detect the variant, and technical information on the tests. ClinVar includes observations and references about variants, but doesn’t currently display patient-specific test information.

Reflecting on the draft guidance, Sherri Bale, managing director of genetic testing firm GeneDx, thinks the agency seems to be conflating the features of a database with the quality of the data in it.

GeneDx, which is part of BioReference Laboratories, has made more than 23,500 variant submissions on nearly 700 genes to ClinVar, the most of any commercial lab, and has developed a process for reviewing variants in its internal database, updating the data, and making regular submissions to the public database. Moreover, GeneDx is working with the Laboratory for Molecular Medicine (LMM), the University of Chicago, and Ambry Genetics to review the classification of 6,000 variants that at least two labs had submitted to ClinVar. 

They've found that the labs agreed on 88 percent of classifications. Of the approximately 724 variants with conflicting classifications, the four labs reviewed 232 and reached a consensus on 86 percent, but couldn't agree on 33 variants. The labs still have to work through nearly 500 variants in this project.

This is precisely the purpose of ClinVar, say those involved in expanding the use of the database. ClinVar is an archival database and doesn't interpret and classify variants, but features the classifications by researchers and labs that submit to the resource. The aim of the database is to bring transparency to the discrepancies in variant classification between labs, so they can work together to resolve differences.

Although there are classification discrepancies and not every record has assertion criteria or interpretations, Rehm has had extensive discussions about the aims and features of ClinVar with the FDA, and understands that a subset of data in the repository could fit the agency's criteria. For example, variants with three stars in ClinVar that have been interpreted by expert panels or variants in practice guidelines with four stars could receive the FDA's nod for regulatory use.

ClinVar currently contains 4,000 unique variation records with three stars and 23 records with four stars. "For three and four-star submitters we collect a lot of documentation about the variant review process," Rehm said. "We could ensure that all criteria that the FDA is asking for is incorporated into our review process to ensure full compliance."

The FDA expects the draft guidance to be finalized early next year, according to experts GenomeWeb spoke to. Rehm said the National Center for Biotechnology Information, where ClinVar is housed, will apply for FDA recognition of the database. She guessed there will likely be a joint application between ClinVar and ClinGen, another program the NIH is hoping to develop into a central resource for genomic knowledge and within which expert groups are curating variants for specific diseases and helping resolve variant classification discrepancies in ClinVar, among other activities.

"It's very difficult for the FDA to provide detailed guidance in the absence of an actual application," Rehm said. "Details are only going to become clear once the FDA has an application before it."

She suspected that not many variant databases would have what it takes to get FDA recognition based on what's in the draft guidance right now, other than, for example, CFTR2; InSiGHT, a resource for hereditary gastrointestinal tumor mutations; and PharmGKB, a publicly available, curated knowledgebase of genetic variants associated with drug response.

According to Madhuri Hegde, executive director of Emory Genetics Laboratory, many labs in the US, including startups, don't maintain variant databases of the type FDA has described. While EGL and larger labs in the country do have databases with the level of transparency, version control, and documentation that FDA is asking for, many organizations are still using rudimentary methods to track variants, such as Excel spreadsheets, she said.

Even if labs wanted to have a database recognized by the agency, "there are going to be institutional restrictions about how much labs can post on a public database," Hegde noted.

EGL, for example, has an internal database, EmVAR, and a public resource, called EmVClass, that features the variants lab directors see in real time. The internal database contains patient-specific and proprietary data that cannot be made public and the public database doesn't contain certain information to protect patients' privacy, such as the frequency of the mutation in the population and the associated phenotype.

Because EGL deals in rare diseases, "we do not want patients to be able to identify themselves," Hegde said, though EmVClass allows healthcare professionals to interact directly with EGL directors when they have questions about variants in the public database. During the 90 days the FDA has given stakeholders to provide feedback on the draft guidances, Hegde said labs should communicate to the agency these privacy-related limitations of public variant databases.

Incentivizing submissions

GeneDx's Bale further pointed out that the FDA needs to clarify what it means by "publicly available," a term that is the crux of the draft guidance and central to the agency's definition of a "genetic variant database" it would recognize. While ClinVar is entirely open access, other databases available in the field have various degrees of access. For example, the Human Gene Mutation Database (HGMD), a widely used resource since 1996 containing published data on inherited disease mutations, is freely available for users from academic and nonprofit organizations but commercial users have to purchase a license. 

Similarly, Quest has launched BRCA Share, a database of BRCA1/2 gene variants that's free for researchers, doctors, and patients to access, but for which commercial labs have to pay a yearly fee based on their size. A Quest representative previously estimated small labs would pay around $10,000 annually, while larger labs would have to commit hundreds of thousands of dollars, although the economics would become more favorable as more labs joined the effort. 

GeneDx was quoted a $250,000 fee, according to Bale. Perhaps because of the fee structure, a year after BRCA Share's launch, Laboratory Corporation of America and Quest are the only commercial labs submitting data. Still, Bale wondered whether the FDA would consider HGMD and BRCA Share public databases.

Pay-for-access schemes aren't merely revenue streams for database owners but are necessary for the continued curation and interpretation of variants in the resource. Labs that have balked at contributing to public databases often cite the databases' funding woes and poor upkeep as reasons. While the FDA doesn't discuss in its draft guidance how it will incentivize data sharing in public databases, such incentives will be critical to their sustainability and success.

"Who is going to fund these databases with all the criteria FDA is asking for?" wondered Roger Klein, medical director of molecular pathology at Cleveland Clinic. Most of the variant databases are academic operations or governmentally funded. Moreover, the people working on these databases are volunteers. "Particularly in academia, folks would be happy to submit [to a public variant database] but they don't always have the personnel, resources, or time," he said.

ClinVar at NCBI is government funded. JHU's CFTR2 has funding from the National Institute of Diabetes and Digestive and Kidney Diseases and the US Cystic Fibrosis Foundation, including a grant from Sequenom to the foundation.

For a variety of reasons, including the lack of funding and policies on proprietary data, there are plenty of academic and commercial labs not submitting to public databases like ClinVar right now. Even if more organizations were to submit to a public variant database, Klein wondered how willing they would be to back such a resource financially.

"In the long run, the dearth of really well-curated databases and the real limitations on incentives for anybody to go through what the FDA is requesting, present obstacles" to the workability of the agency's proposals, Klein said.

Although the agency doesn't plan to limit its recognition to a single database, what the field needs, according to Bale, is a freely available, central resource for variant information. "The answer is a database that's maintained by the government, that's likely to be available [long term], that's truly public: ClinVar," she said. "That's where our data should be going."

Like GeneDx, LMM and EGL have each contributed data on more than 16,000 variants into ClinVar. Bale wondered, however, who at NCBI will want to tackle the application to garner FDA recognition for the database. 


This is the second installment of a two-part story on FDA's draft guidances on next-generation sequencing tests. The first installment focused on industry reactions to the recommendations, mainly the draft guidelines for demonstrating analytical validity of germline NGS tests.