WASHINGTON - The US Food and Drug Administration is eager to flesh out its pharmacogenomics guidance with technical details for submitting genomic data, but a number of informatics issues are proving difficult to resolve, according to FDA officials who spoke at a workshop here this week.
The agency recently released a draft “concept paper” as a first step toward an attachment it is planning for the pharmacogenomics guidance document it issued last year. The concept paper, “Recommendations for the Submission and Review of Genomic Data,” provides guidelines on a number of technical issues, such as sample preparation, labeling systems, hybridization protocols, and reporting formats.
The paper addresses four main topics: gene expression microarray data, genotyping, genomic data in clinical study reports, and genomic data from nonclinical toxicology studies. But it does not touch upon statistical methods for normalization or gene list selection, classification algorithms, or specific data standards that might be used in generating this information.
Federico Goodsaid, senior staff scientist in the genomics group at FDA’s Office of Clinical Pharmacology, said that those details were omitted by design, because there are still too many unresolved questions related to statistics, algorithms, and data standards in genomics.
Goodsaid spoke during a workshop here this week to discuss key issues that the proposed guidance attachment should address. The two-day workshop, “Best Practices and Development of Standards for the Submission of Genomic Data to the FDA,” was co-sponsored by the FDA, the Drug Information Association, the Pharmaceutical Research and Manufacturers of America, and the Biotechnology Industry Organization.
The agency plans to gather feedback on the draft concept paper from workshop attendees and the broader community over the next year and will take that input into consideration for the final attachment to the pharmacogenomics guidance.
Goodsaid told BioInform that the concept paper covers technical areas in which the FDA felt there was some “consensus” in the community and that the agency deliberately steered away from more controversial topics, which would likely be addressed in a separate document.
The goal of the workshop, he said, was to “capture consensus where it exists, and where it doesn’t, to promote discussion.”
The P-Value vs. Fold Change Controversy
If the intensity of discussion was any indication, the workshop was a success. One topic that spurred considerable debate centered on recently published data from the Microarray Quality Control Consortium, which indicated that the oft-reported lack of reproducibility between microarray experiments may have nothing to do with the technology itself, and everything to do with the statistical methods used to select lists of differentially expressed genes.
In particular, the MAQC study found that poor reproducibility may be due to the common practice of ranking genes by a statistical significance measure, such as a stringent P-value cutoff. When fold change was used as the ranking criterion instead of P-value, the gene lists became much more reproducible.
“It’s not the platforms,” said Leming Shi, a computational chemist at the FDA’s National Center for Toxicological Research who led the MAQC effort. “Even the same data set gives different gene lists when subject to different statistical analysis.”
This finding runs counter to recent trends in microarray analysis toward more statistically rigorous analytical methods. Roderick Jensen, director of the Center for Environmental Health, Science, and Technology at the University of Massachusetts, Boston, introduced a panel on the subject by noting that the MAQC results changed his “worldview” about microarray analysis, while Russ Wolfinger, director of scientific discovery and genomics at SAS, said that the MAQC findings initially “infuriated” statisticians, who have been pressing the microarray community for years to rely more on statistical methods.
Wendell Jones, senior manager of bioinformatics and statistics at Expression Analysis, used simulated data to show that the best approach appears to be a combination of fold change and a non-stringent P-value cutoff. This provides the specificity and sensitivity of the statistical significance measure with the reproducibility of the fold-change cutoff.
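The combined approach Jones described can be sketched in a few lines of Python. This is only an illustration, not the MAQC's or Jones's actual procedure: the `GeneStat` record, the `select_genes` function, the 0.05 cutoff, and the example values are all hypothetical, and the sketch assumes that per-gene log2 fold changes and P-values have already been computed elsewhere.

```python
from typing import List, NamedTuple


class GeneStat(NamedTuple):
    """Per-gene summary statistics (hypothetical record for illustration)."""
    gene: str
    log2_fold_change: float
    p_value: float


def select_genes(stats: List[GeneStat], p_cutoff: float = 0.05,
                 top_n: int = 100) -> List[str]:
    """Filter with a non-stringent P-value cutoff, then rank by fold change.

    The cutoff and list size are assumed values, not recommendations.
    """
    # Step 1: keep genes that pass the (deliberately loose) significance filter.
    passing = [s for s in stats if s.p_value < p_cutoff]
    # Step 2: rank the survivors by absolute log2 fold change, largest first.
    ranked = sorted(passing, key=lambda s: abs(s.log2_fold_change), reverse=True)
    return [s.gene for s in ranked[:top_n]]


# Made-up example values, for illustration only.
example = [
    GeneStat("GENE_A", 2.5, 0.030),
    GeneStat("GENE_B", -3.1, 0.040),
    GeneStat("GENE_C", 0.2, 0.0001),  # highly significant, but tiny change
    GeneStat("GENE_D", 4.0, 0.200),   # large change, fails the P-value filter
]

selected = select_genes(example, p_cutoff=0.05, top_n=2)
print(selected)  # ['GENE_B', 'GENE_A']
```

In this toy case, ranking by P-value alone would put GENE_C first despite its negligible fold change, while ranking by fold change alone would put GENE_D first despite its weak statistical support. The combined rule excludes both.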
Wolfinger described reproducibility as an important “third dimension” in addition to sensitivity and specificity. Statisticians prefer sensitivity and specificity, he said, while biologists and chemists are more comfortable with reproducibility. “The weight attached to each dimension depends on the goal of the study,” he said.
The question now for the microarray analysis community — and the FDA — is how to determine guidelines for weighting those measurements properly.
FDA’s Shi noted that the finding would spur future discussions on “how to balance fold change and P-value,” but stressed that the MAQC was intended as a research project, not a regulatory project, and the agency does not plan to issue recommendations on the matter.
Nevertheless, it was clear that the issue is of some concern to FDA officials. Felix Frueh, associate director of genomics at the FDA, said that the MAQC findings regarding statistical analysis were “very critical,” and added that there have been “very intense discussions at the FDA about this.”
FDA’s Goodsaid noted that the microarray community must sort out its statistical issues if the technology is to find its way into the clinic. “Say you’ve got 20 genes that are to be used in a clinical trial,” he said. “You’ve only got one chance. If that has to be repeated, people will start not liking genomics.”
Goodsaid added, “That’s not a regulatory worry. That’s a worry about the field of genomics. If there are statistical limitations, then we need to think about how we validate those [gene] lists.”
In addition to the controversy surrounding ranking criteria, there are also unanswered questions regarding the best methods for normalizing microarray data. Robert Delongchamp, a mathematical statistician at the FDA’s NCTR, called normalization the “Achilles’ heel” of microarray technology. “You have to make an assumption and get people to believe you,” he said.
“There is no consensus on how to normalize the data,” he said during a panel discussing “best practices” for submitting genomic data to the FDA. He acknowledged that statistical considerations fell a bit outside the scope of the panel because best practices in that area have yet to be identified. “There is a lack of consensus when it comes to analyzing [gene-expression] data,” he said, adding that this uncertainty was responsible for the “lack of detail” regarding statistical methods in the FDA concept paper.
Classification methods represent another controversial area in the field. The MAQC intends to address this subject in the second phase of the project, which officially kicked off this week.
Standards Are on the Way
Another hot topic at the workshop was the lack of standards for formatting and submitting genomic data to the FDA, but it appears that this problem may be resolved a bit sooner than those surrounding analytical methods.
Representatives from the Clinical Data Interchange Standards Consortium and Health Level 7 — the leading standards bodies for clinical and healthcare data, respectively — provided an update on efforts underway to bring genomics data into the fold of existing data standards.
Edward Helton, chief scientist of regulatory and biomedical affairs at SAS, said that CDISC is currently developing a “pharmacogenomics domain” that will enable drugmakers to submit genomic data as part of CDISC’s Study Data Tabulation Model format.
HL7, meanwhile, has several genomics-related activities underway. Phil Pochon of Covance Laboratory said that the organization has already passed “draft” standards for single-gene, multiple-gene, and pharmacogenomics message elements. In June, HL7 also began piloting a clinical genomics standard for use in genetic counseling.
Pochon said that the draft standards are intended for use in pilot projects, and that HL7 is seeking partners who might be interested in conducting such pilots.
In the absence of clear standards, the FDA is forced to work with whatever it gets. Michael Orr, senior staff scientist in the CDER Office of Clinical Pharmacology at FDA, said that most of the data the FDA has received under its voluntary genomic data submissions initiative “has not been in electronic format.” Frueh said that the FDA’s lack of guidelines regarding data formats led to some unexpected outcomes in the VGDS program. “It was as bad as microarray data submitted in PDF files,” he said.
Orr explained that data that does come in the correct format, such as Affymetrix CEL files, is deposited in the FDA’s ArrayTrack database, where FDA reviewers analyze it with a number of in-house and commercial tools, including Rosetta Resolver and Ingenuity Pathway Analysis, and against reference databases from Gene Logic and Iconix.
The system isn’t perfect. In a panel discussion at the workshop, representatives from several drugmakers expressed concern that FDA researchers had reanalyzed their genomic data using methods and reference databases different from their own, which led to different results. In most cases, FDA and sponsors ultimately reached the same biological interpretations of the data, but the discrepancies did raise a few red flags for some pharma representatives.
Brian Spear, director of pharmacogenetics at Abbott, noted that FDA researchers reanalyzed Abbott’s data — gene expression microarray files related to a preclinical toxicogenomics study — “independent of the study design,” which led them to “draw conclusions beyond our questions.” He also said that the FDA reviewers exhibited an “unavoidable desire to look at individual genes,” which was beyond the scope of Abbott’s submission.
He added that the FDA was using a different version of Rosetta Resolver than Abbott used, which also led to slightly different results.
Nevertheless, although the FDA researchers and Abbott did not identify the same gene sets, they did reach “identical conclusions with regard to the biological interpretation” of the data, Spear said. He added that he was impressed with the FDA team’s analytical capabilities. Its assessment of the data set was “brilliant,” he said, “and not just because they agreed with us.”
Orr said that the FDA is continuing to improve its IT infrastructure for handling genomics data, but noted that the nascent state of the field makes long-term planning difficult. “We’re still not certain what types of tools, interfaces, and report-generation tools will be required for reviewers to work with pharmacogenomics information,” he said.