The US Food and Drug Administration is rolling up its sleeves and diving elbow-deep into the messy world of microarray data as it girds for a potential wave of genomics-derived information submitted as part of the drug approval process.
At a June 10 meeting of the FDA’s pharmacology/toxicology subcommittee on pharmacogenomics, the agency took its first steps toward tackling the questions of whether, when, and how it will accept data from microarray experiments submitted under investigational new drug applications (INDs) or new drug applications (NDAs).
“[FDA reviewers] haven’t seen microarray data,” said William Mattes, associate director at Pfizer’s Kalamazoo Genomics Center of Excellence, who attended the subcommittee meeting. “They’re concerned about how to handle it, they’re concerned about its analysis, they basically are concerned about what does it look like, what does one do with it?”
According to another attendee, Kurt Jarnagin, who is vice president of biological sciences and chemical genomics at Iconix Pharmaceuticals, “The agency believes that in time, [microarray data] will become a standard part of any submission, either IND or NDA, and it needs to prepare itself for that day. They can’t suddenly start getting inundated with data and expect to respond to that.”
As a first step in familiarizing itself with the nuances of microarray data, the FDA’s Office of Testing and Research has embarked upon two separate gene expression database projects. One, in collaboration with Iconix, will introduce FDA reviewers to the basics of microarray data via the company’s DrugMatrix toxicogenomics database. A second project, with Schering-Plough and Affymetrix services provider Expression Analysis, will create an internal “mock submission” database for gene expression data.
The outcome of the two database projects will shape a draft guidance document the FDA is preparing on the submission of microarray data. BioInform’s sister publication, BioArray News, reported last week that Janet Woodcock, director of the agency’s Center for Drug Evaluation and Research, expects the draft guidance to be prepared by August.
Getting Past the Fear Factor
The FDA has had access to the DrugMatrix database since March, when it began a collaboration with Iconix to gain hands-on experience with toxicogenomics data and tools. The agency is boning up on the database as part of an effort to correlate the content and format of gene expression microarray data with standard toxicology and pharmacology study results. Iconix is training FDA reviewers on quality control and quality assurance for microarray data generation, as well as the analysis of data across multiple microarray product platforms, and the validation of biomarkers from integrated chemogenomic datasets.
The database contains findings from approximately 600 compounds, across multiple doses and time points. Gene expression data is linked to information on pharmacology, histopathology, clinical chemistry, and toxicology related to those compounds, to provide a “contextual reference set” for FDA reviewers to compare new findings with known results, Jarnagin said. “It gives the opportunity to ask specific questions,” he added. “For example, ‘Is it the case that the change of any oncogene can cause cancer?’ You can look at the database and see that that’s not true. There are dozens of drugs that elevate [expression of] one or several oncogenes, yet have been used in patients for years and years with no evidence of additional oncology.”
The FDA’s use of such a reference database for the evaluation of gene expression data could alleviate much of industry’s lingering anxiety about submitting genomics-driven data, Jarnagin suggested. “There’s a lack of trust that FDA will actually respond to the whole picture, and not respond to one gene,” Jarnagin said. Drug companies “have to get over the fear factor that ‘If I submit this big experiment, some gene’s going to change that the FDA’s going to go nuts about and kill my compound.’”
Bringing It In-House
The goals of the planned internal gene expression database are a bit different from those of the FDA’s project with Iconix. In this effort, FDA, Expression Analysis, and Schering-Plough will build a framework to support the “mock submission” of data from a drug project Schering opted to discontinue. “We’re taking that data, which includes microarray data, histology data, clinical chemistry data, and phenotype data, and helping FDA to understand the appropriate format, content, and context of microarray-based submissions,” said Steve McPhail, CEO of Expression Analysis.
Pilot submission to the database is expected to begin in June, and the project is scheduled for completion in October, McPhail said. A final summary report on the project is planned for November.
The project will address a laundry list of issues, including laboratory infrastructure, sample processing and array QC/QA issues, and experimental design and replication, but informatics-related questions make up the majority of topics. Data management issues such as format and file structures, linkage mechanisms between microarray data and other datasets, statistical analysis systems and software, and inference and modeling methods will all be examined as part of the project, McPhail said.
Expression Analysis will use Affy’s MAS 5.0 software to analyze the data, but “we may use other methods as well,” McPhail said. While the company has two years’ experience processing Affymetrix data, “the linkage mechanisms are not something we’ve worked on in the past,” he noted, so Expression Analysis is turning to its sister company, regulatory informatics firm Constella Group, to handle the integration between microarray data and other clinical information.
Initially, the project will follow CDER’s current guidance recommendations for regulatory submissions in electronic format, with the goal of identifying areas that need to be modified or redefined. This guidance stipulates that datasets be submitted as a SAS transport file of less than 25 MB per file, with data variable names of no more than eight characters, data elements defined in data definition tables, and variable names and codes consistent across studies.
The submitted array data will include raw data files after image analysis. In addition, a summary report will be provided to describe normalization, data processing, and statistical analysis steps. It is expected that these guidelines will be extended to improve compatibility with microarray data as the project progresses.
The FDA’s database activities are not without precedent. A project spearheaded by the International Life Sciences Institute consortium and the European Bioinformatics Institute has been developing a centralized, public gene expression database for over a year. It is built on the EBI’s ArrayExpress gene expression database, with the intention of linking toxicogenomics data from multiple platforms. Data input is currently ongoing, and the complete database is expected to come online by the first quarter of 2004.
“The intent of the ILSI effort was to establish some public offering that could be helpful in developing standards,” said Pfizer’s Mattes, who is on the ILSI database working group. Building on the MIAME (minimum information about a microarray experiment) guidelines, the ILSI/EBI project has drafted a revised version of the standard called MIAME/Tox that aims to establish some consensus on the minimal descriptors for array-based toxicogenomics experiments (available at http://www.ilsi.org/committees/hesi/genomics/MIAME1.1ToxCircDRAFT-rev3.DOC).
Judging by the near-universal acceptance of the MIAME standard in the microarray world, it’s likely that MIAME/Tox will gain broad support within the toxicogenomics community. However, it is still in draft form, and has not been endorsed by anyone yet, least of all the FDA. CDER’s Office of Information Management coordinates all of its standardization efforts, but according to Mattes, “there needs to be some communication between that group and anything going on in terms of a toxicogenomics database.”
Indeed, the reigning CDISC-based guidance at CDER differs in a number of ways from the proposed MIAME/Tox standard. MIAME/Tox proposes a more restrictive vocabulary, for example, with a field proposed for each clinical chemistry test. MIAME/Tox also collects information on in vitro experiments, while the standing CDER guidelines don’t require it, and MIAME/Tox does not collect information on drug plasma levels, whereas this is currently done under the CDER guidelines.
But MIAME — along with its accompanying data format, MAGE — is only the first piece in a much larger set of standards that need to be developed for a fully functional toxicogenomics data submission platform. In addition to a dearth of standards for experimental design, normalization, and a “universal” RNA, “there is no standard yet for analysis,” said Mattes. “So, if somebody says, ‘I’ve identified the regulated transcripts after this particular treatment,’ what’s the best way [to verify that analysis]? It’s a huge question.”
While the ILSI database project initially set out to address these standardization issues, Mattes said the group is far from a solution. “We have discussed and compared analysis, but resolved them? That’s a definite no,” he said.
The Risks of Risk Assessment
The ILSI/EBI group has made some headway into the very issues that FDA plans to address with its own database, but there has been no formal interaction between the two groups so far, Mattes said. However, he added, “This may be the time for it. I’m sure, coming out of the [subcommittee] meeting, it would be a time when FDA would be interested in doing that, and I know we would be too.”
FDA could save itself some duplication of effort — and perhaps a lot of heartache — by communicating with the ILSI group. While the goals of the two projects are slightly different, the ILSI project did set out with the intention of creating a mock submission database for regulatory-bound gene expression data. However, Mattes said, after a bit of discussion on the subject, “we decided that the data we had developed was not appropriate for a mock submission.”
Why was it unacceptable? “It didn’t address risk assessment,” Mattes said — a point that will likely impact the FDA’s own database effort. The question of risk assessment lies at the crux of the entire regulatory process, and is one that the agency has yet to address regarding microarray data, Mattes said. So far, “genomics has been used in a predictive role, in the sense that genomics data from a short-term animal study or an in vitro study is used to anticipate longer-term treatment adverse events,” he said. “In that case, the issue isn’t risk assessment, the issue is prediction.”
Microarray data may help flag potential problems in regulated assays, Mattes said, “but it has never substituted for a standard, regulated study, and I don’t think anyone has anticipated that it would. In which case, then, you ask yourself, ‘If I’m going to submit all my standard, regulated studies anyway, why would I need to submit genomics data?’”
The lack of standards in the field is another sticking point, Mattes said. “It obviously gets in the way of the FDA saying, ‘Submit this data.’ If we can’t say what the quality factors are for this data, and how we can analyze it, it’s too soon to submit it.”
Despite his doubts, Mattes is following the “just in case” strategy of many of his colleagues — and the FDA itself — in crafting the means by which microarray data may be submitted to the agency if and when it becomes a requirement. Though the project may still be a bit “premature,” Mattes said, “working through the mock submission is a way to enlighten the agency, and enlighten the sponsors, on some of the issues that we’ve got to confront.”

Jarnagin reiterated the ambivalence in the community over the issue. “On the question of whether the agency should encourage submission, and whether the agency should prepare itself to accept submission, I saw unanimity among the [subcommittee] panel, and the answer was yes,” he said. “As to whether this becomes part of the regulatory decision today, it seemed that the gestalt of the panel was probably not today; but looking out into the future, at some point it will probably become more common. Whether it will become routine, I didn’t see unanimity in the panel and I don’t have unanimity in my mind either.”
FDA’s decision to get its hands dirty and grapple with microarray data first-hand may be the impetus that drives genomics data into the regulatory process. A circular argument has long stalled the field: industry isn’t submitting data to the FDA precisely because the agency is unfamiliar with it, and the agency remains unfamiliar because no data is submitted. With the FDA now taking an interest in the standard-setting process, “it certainly benefits the entire scientific community to move forward with standards in normalization and analysis,” Mattes said. “And that gets to the point that you almost need that in place before you can answer the question of how this data can be used in a risk-assessment standpoint.”
Mattes added, however, that a great deal of work remains before the FDA is able to determine when and how microarray data should be included in the regulatory process. “I think we’re trying to work out the nuts and bolts before we get there,” he said.
McPhail agreed that it is too soon to jump to any conclusions regarding the future of microarray data in the regulatory process. “The agency is just trying to get [its] arms around format, content, and context at this point in time, so it’s probably too soon to tell what impact this will have on the future of microarray testing in support of INDs and NDAs,” he said.