SILVER SPRING, Maryland – At a two-day workshop held here last week to discuss regulatory standards for next-generation sequencing tests, industry stakeholders, academicians, and researchers considered whether a prescriptive framework or more general guidelines are needed to establish the analytical validity of such tests, and how curated databases might be used to gather information on the clinical validity of the assessed markers.
The US Food and Drug Administration has been gathering public input on how best to regulate NGS since the agency's traditional regulatory framework isn't equipped to quickly oversee the technical complexity and wide range of markers gauged by such tests. Based on discussions at an earlier FDA workshop this year, the agency began mulling analytical validity standards for NGS, and following its 2013 clearance of Illumina's MiSeqDx NGS platform for cystic fibrosis, which relied on Johns Hopkins University's CFTR2 database, it seemed open to considering how labs might use similarly well-characterized variant repositories to demonstrate clinical validity.
Although the workshop didn't yield any specific regulatory policies, the discussions kept returning to certain themes, such as data sharing, transparency, and public communication of test limitations. These issues have likely been percolating among industry stakeholders amid the recent criticism Theranos has faced for not publishing evidence underlying its blood tests.
Robert Califf, nominee for the FDA commissioner post, characterized NGS as an area of great excitement but also of "tremendous uncertainty because the science is advancing so fast." Establishing a regulatory framework for NGS is part of FDA's charge under the Precision Medicine Initiative, which aims to build a 1-million-patient cohort to advance research. "I think [this] will be a 10- or 15-million-patient cohort by the time we're done," Califf said.
Although precision medicine has great potential to improve the health of Americans, "the basis for getting there is having tests that give us accurate and useful information," Califf said. In the past this was a comparatively simple proposition, but NGS has made regulation more complex. FDA's job is to ensure the safety and effectiveness of tests but also to encourage innovation, and to achieve this, Califf said, good regulation has to adapt to advances in technology.
"You don't have to work here for too many days before you begin to realize the degree to which the American public depends on the FDA and would prefer not to have to think about it every day but just assume that things are being taken care of. So, when they have a test or get a treatment or they eat something, it's safe," he said. "Innovation for its own sake is one of the critical American values, but it's not enough. If we're going to benefit patients, if we're going to save lives and reduce suffering, the new tests and drugs have to be supported by sound science."
Support for a hybrid approach
Since the public workshop in February, the FDA has been gathering ideas for regulating NGS tests, a task that has proven challenging for its traditional oversight framework since the markers gauged by NGS aren't predefined as they are in traditional tests. At the earlier meeting, industry players seemed to favor a more flexible design concept standard to establish analytical validity, as opposed to more prescriptive performance standards.
Under either framework, the test developer would start with a specific intended use. However, under design concept, the test developer follows certain "principles" or critical factors as it designs the test. Following a kind of recipe, test developers should be able to consistently produce high-quality NGS tests. Meanwhile, the performance standard mechanism would establish specific metrics for accuracy, precision, sensitivity, and specificity; require certain protocols and studies; and define thresholds for measurement.
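The metrics named in the performance-standard approach have standard definitions in terms of a confusion matrix. As an illustrative sketch only, not any formula proposed at the workshop, sensitivity, specificity, precision, and accuracy for a variant-calling test against a truth set could be computed like this:

```python
def performance_metrics(tp, fp, tn, fn):
    """Compute common analytical performance metrics from
    confusion-matrix counts for a variant-calling test:
      tp: true variants the test correctly called
      fp: calls the test made that are absent from the truth set
      tn: non-variant positions correctly reported as reference
      fn: true variants the test missed
    """
    return {
        "sensitivity": tp / (tp + fn),   # fraction of true variants detected
        "specificity": tn / (tn + fp),   # fraction of non-variant sites correctly called
        "precision":   tp / (tp + fp),   # fraction of the test's calls that are real
        "accuracy":    (tp + tn) / (tp + fp + tn + fn),
    }

# Example: 95 true variants found, 5 missed, 3 false calls,
# 997 reference positions correctly reported
m = performance_metrics(tp=95, fp=3, tn=997, fn=5)
print(round(m["sensitivity"], 3))  # 0.95
```

As speakers noted later in the workshop, in practice each of these numbers would need to be reported separately per variant class (SNVs, indels, rearrangements), since performance differs sharply across them.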
"There is really no one size fits all," Birgit Funke from Harvard Partners said during the first day of the workshop, noting as an example that depending on whether the test is gauging germline or somatic variants, the analytical requirements will vary. Although genomic testing firms are using both design concept and performance standards, the latter is more difficult to generalize, she said.
On the other hand, John Pfeifer from Washington University School of Medicine likened the College of American Pathologists' process-centric checklists for establishing the analytical validity of NGS tests to a recipe for baking a cake, at the end of which one can't be sure whether the end product will be a cake or a lump of flour on a plate. "There is so much variability in the next-generation sequencing platforms, test designs, and bioinformatics pipelines, that we have to understand that the process-based approach is … in fact inadequate," Pfeifer said. "We're really at a point now because there is so much variability that we need standards."
Indeed the analytical performance of NGS tests can differ dramatically based on variant type, allele frequencies, specimen types, and a range of other factors. Diagnostic labs performing NGS have internal processes for managing this variability. For example, Geoff Otto from Foundation Medicine pointed out that his company has developed different platforms to gauge specific types of variations, such as indels, copy number changes, and genomic rearrangements. "We have different metrics for each of those," he said.
Based on these and other comments, FDA officials gathered that the workshop attendees were asking the agency to consider a hybrid of the design concept and performance standards approaches. Girish Putcha from Medicare contractor Palmetto highlighted that within the MolDx program, his group had taken a hybrid approach to advancing analytical standards for comprehensive genomic panels for metastatic non-small cell lung cancer by combining NGS-specific requirements, third-party lab reviews, and proficiency testing for reported variant classes.
Some workshop attendees talked about establishing minimum coverage specifications based on the type of test performed, for example for cancer or an inherited condition. Others asked FDA to focus on metrics for accuracy.
There were lengthy discussions on the need for reference materials for validation and proficiency testing, including well-characterized patient samples, in silico data sets, and genetically engineered samples that have a range of variants. While several speakers noted that in silico datasets are helpful when there aren't sufficient patient samples, most maintained that there is no substitute for having real samples.
Everyone agreed, however, that there is a lack of funding for developing much-needed reference materials. "To characterize these reference materials is much more expensive than laboratory validation because you really need to sequence them with more than one sequencing technology," observed Deanna Church from Personalis. "You need a sophisticated adjudication mechanism for resolving differences and a lot of the analysis going on right now is being done by postdocs and grad students."
Commercial entities such as Horizon Discovery are developing reference materials, but such resources add R&D cost, Putcha reflected. "The problem comes back to reimbursement … and what you actually get paid effectively for all of this R&D work," he said. "Realistically, it seems like the market might have to create the incentive to actually do this, but payors also have to acknowledge that this becomes part and parcel of how you get a test to the market and how you keep it available."
Elizabeth Mansfield, director of personalized medicine at FDA's Office of In Vitro Diagnostics and Radiological Health, recognized the funding gap for reference materials. "This is a rate-limiting step in the development of next-generation sequencing as a very strong clinical application," she said.
Incentivizing data sharing
The FDA is interested in using curated variant databases, such as JHU's CFTR2, "as a source of clinical validation evidence" for markers gauged by NGS tests, Mansfield said during the workshop. If variant databases can become a "reliable and believable source of data," test makers don't have to go out and find patients with a specific variant and document their phenotype and their response to drugs, she explained.
Within the databases themselves, the FDA is mulling whether variant/disease associations used for regulatory purposes will require statistical significance. Given the rarity of many variants identified by NGS, showing statistical significance will be challenging. At a meeting in March next year, the FDA will gather input on this topic. "We're not trying to drive databases to some level of fantastic significance," Mansfield said during the second day of the workshop. "What we're trying to do is aggregate information so individual observations all come together and help us make better decisions when we see the same thing again."
"We'd like to be able to have a little bit more of a hands-off [approach and], put away the magnifying glass," she added.
There will be instances where a lab will observe a variant that's not in a regulatory-grade database, or a lab's internal database shows a marker to be a variant of unknown significance (VUS) while the repository FDA is using identifies it as benign. In such cases, labs could use their own information for regulatory purposes but would have to justify it with other evidence, FDA officials told workshop attendees.
Stakeholders in the life sciences field are already developing shared variant databases and advancing common criteria in terms of interpretations and nomenclature. For example, the American College of Medical Genetics and Genomics, Association for Molecular Pathology, and CAP have jointly developed a variant classification system and standard terminology that labs and clinical geneticists can use to determine if a genetic variant identified in a patient is associated with disease.
Meanwhile, ClinVar is the National Center for Biotechnology Information's archival database of variants, where labs have made 170,000 submissions on more than 130,000 variants. ClinVar doesn't curate variant information, but does ask submitters to describe their classification methodology.
Before databases like ClinVar can be used for regulatory purposes, however, the field must address a number of challenges. For example, there is no uniform nomenclature for describing variants in databases, although NCBI is thinking about developing an allele registry that would map between different variant descriptors and issue a single identifier. Also, meeting participants emphasized the importance of clear and transparent standard operating procedures for maintaining and updating databases and the need for formalized training opportunities for those curating variant data.
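The core idea behind such an allele registry can be sketched as a simple lookup that resolves equivalent descriptors for the same variant to one canonical identifier. This is purely a hypothetical illustration; the descriptor strings and registry IDs below are invented and do not reflect NCBI's actual design:

```python
# Hypothetical allele registry: equivalent descriptors for the same
# variant (transcript-level HGVS, protein-level HGVS, legacy names)
# all map to one canonical ID. All names below are made up.
REGISTRY = {
    "NM_0001.1:c.100A>G": "VAR-001",
    "p.Lys34Glu": "VAR-001",   # protein-level alias for the same change
    "K34E": "VAR-001",         # common legacy shorthand
    "NM_0002.1:c.55del": "VAR-002",
}

def canonical_id(descriptor: str) -> str:
    """Resolve any known descriptor to its single registry ID, so
    database submissions using different nomenclatures can be merged."""
    try:
        return REGISTRY[descriptor]
    except KeyError:
        raise ValueError(f"unregistered descriptor: {descriptor!r}")

# Two differently written descriptors resolve to the same record:
assert canonical_id("K34E") == canonical_id("p.Lys34Glu")
```

With a mapping like this in place, two labs submitting the same variant under different names would land on the same database record rather than creating apparent discrepancies.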
Maintaining databases is time- and resource-intensive, and historically, limited funding has stood in the way of keeping public variant databases accurate and up to date. For example, GeneDx's Sherri Bale said that her lab has made close to 24,000 variant submissions into ClinVar, but as many as 10 full-time lab professionals are working on this effort with only "a tiny bit of money" from ClinGen.
"We need to find ways to incentivize this," Mansfield acknowledged. "So, Girish [Putcha from Palmetto], I look to you," she quipped, a tacit acknowledgment of increasing demands from the life sciences sector that payors must reimburse labs for submitting information to databases. Putcha said at the meeting that Palmetto would be interested in hearing from the life science community about how it could "help facilitate the development of such databases through coverage and reimbursement policies."
Experts had different ideas, however, about what aspects of database development regulators and payors should focus on for regulatory purposes. For example, Heidi Rehm from Partners HealthCare Personalized Medicine pushed for incentivizing data sharing.
Around 13 percent of variants in ClinVar have divergent interpretations between labs, she said. However, when four labs that together have submitted about one-third of the variants in ClinVar (around 35,000 variants) came together to share data and discuss interpretation discrepancies for 115 variants, they were able to resolve interpretation differences 71 percent of the time.
"I would argue that from the data I've shown you and the experience we've had, the benefits of data sharing and comparing variant interpretations are exceedingly clear and are necessary for patient safety," she said. Rehm asserted that journals should require data sharing for publications; lab accreditation organizations should require variant interpretation information for quality control purposes; doctors should order tests from labs sharing data; and insurers should reimburse labs for this activity. Finally, "FDA should consider tests from labs that do not share their interpretations to be higher risk, because that data is not subject to peer review," she said.
In contrast, Julie Eggington from 23andMe didn't find much value in currently available public variant databases due to the discrepancies between them. "So far, public databases are not yet successful for variant classification when used in isolation," said Eggington, who from 2009 to 2013 was a clinical variant specialist at Myriad Genetics.
Myriad houses the largest proprietary variant database on BRCA1 and BRCA2 genes. Eggington and others from Myriad published a study in the Journal of Community Genetics earlier this year comparing variant classifications across five databases — Breast Cancer Information Core; the Leiden Open Variation Database 2.0 in the Netherlands; ClinVar; Inserm's UMD; and the Human Gene Mutation Database in Cardiff. Of 2,017 analyzed BRCA variants, 116 were identified as pathogenic in at least one database, but all the databases agreed on the classification of only four variants. Meanwhile, 34 percent of the mutations that Myriad identified using its own database didn't show up in any of these other repositories.
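The counting logic behind that kind of cross-database concordance analysis is straightforward to sketch. The following is an illustrative reimplementation, not the study's actual code, and the variant and database names are made up:

```python
def agreement_stats(classifications):
    """classifications: {variant: {database: classification}}.
    Returns a tuple of (number of variants called pathogenic by at
    least one database, number of variants where every database
    that reports the variant gives the same classification)."""
    pathogenic_somewhere = 0
    full_agreement = 0
    for variant, by_db in classifications.items():
        calls = set(by_db.values())
        if "pathogenic" in calls:
            pathogenic_somewhere += 1
        if len(calls) == 1:  # all reporting databases concur
            full_agreement += 1
    return pathogenic_somewhere, full_agreement

# Toy data: three variants across hypothetical databases
data = {
    "var1": {"dbA": "pathogenic", "dbB": "pathogenic", "dbC": "pathogenic"},
    "var2": {"dbA": "pathogenic", "dbB": "benign"},
    "var3": {"dbA": "benign", "dbB": "benign"},
}
print(agreement_stats(data))  # (2, 2)
```

Note that, as in the published comparison, a variant can count as "pathogenic somewhere" while the databases still disagree about it, which is exactly the gap between the 116 variants flagged pathogenic and the four on which all five databases concurred.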
Based on this and other data, Eggington argued that to date only "private databases have been proven successful." What's ultimately critical is that labs publish data on their variant classifications, she said, since databases aren't themselves verifiable without associated information in the literature. Furthermore, if the FDA wants to start regulating based on information in variant repositories, Eggington suggested the agency audit internal, private databases.
Still, the FDA seemed more than willing to support the development of public variant repositories like ClinVar. "We heard from many [at the workshop] that we need to keep ClinVar alive," Mansfield said. "Whether you love it or hate it, it is a great resource. It already has a lot of information in it and it can be a centralized place in the US to start trying to combine all the data that may be spread out all over the place now."
This idea of publishing validation data and being transparent about the underlying evidence for a test was a theme that spanned the two-day workshop. Some attendees complained that when healthcare providers try to get information from labs about test validation, they're often told this is proprietary information.
Foundation Medicine's Otto noted that publishing data on the test's capabilities is critical from a patient safety perspective. "We see a lot of patients with cancer. We're going to get one shot at the genomic profile and depending on what that assay does or doesn't do you're going to give them a radically different standard of care," he said.
Mansfield observed that there was "an enormous amount of concern" during the two-day workshop about the lack of transparency in test validity information, and even with regard to databases. She indicated that the FDA will take these concerns into consideration as it advances regulatory policies.
"The limitations of the test should be part of the transparency," Mansfield said. "That's something we at the FDA are very familiar with now in terms of what a test can do and what it can't do. And we strongly agree that limitations need to be publicly expressed."