As drug-discovery firms look to use biomarker data as part of their drug submissions, and the US Food and Drug Administration prepares to consider ever more data from omics technologies, pharmas are struggling to identify the proper standards to ensure that data from their preclinical research can be integrated with clinical trials data.
This challenge was underscored at “A Call to Arms — Data Standards in Collaborative Research,” a conference hosted by the Massachusetts Technology Leadership Council earlier this month in Waltham, Mass., where participants highlighted the commonalities and differences between the clinical and pre-clinical research spaces, panel moderator Joseph Cerro told BioInform in an interview. Cerro is president and founder of the Schooner Group, an informatics consulting firm.
Cerro said that meeting participants discussed a range of issues, including the need to move the standards discussion beyond the issue of interchangeable file formats to include broader aspects such as experimental conditions, and identifying “common ways to describe experiments [and] experimental conditions regardless of the type of raw data underlying it.”
Other issues of interest to the pharma representatives in attendance included keeping tabs on regulation, how existing standards in the clinical realm will affect pre-clinical areas, and how to manage collaborations with outside partners in academia or with contract research organizations to ensure data portability.
As Cerro explained, harmonizing standards at the intersection between clinical and basic research is not only about file formats, but about “agreeing [on] what is the nature of the data and what’s the meaning behind it and how can you re-use it in different contexts.”
Rather than debating the pros and cons of any particular standard, he said, it may be better to “take that off the table as an item of contention” and “make sure there is a translator in place whenever you move data from one place to another.”
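The "translator" Cerro describes can be pictured as an explicit field-mapping layer that runs whenever data moves between systems, so no group has to abandon its own format. A minimal sketch, assuming two hypothetical schemas (the lab-system field names are invented; the target names borrow the style of CDISC variable names purely for illustration):

```python
# Sketch of a translation layer between two data formats.
# FIELD_MAP and both schemas are illustrative assumptions,
# not drawn from any real system or standard mapping.
FIELD_MAP = {
    "sample_id": "USUBJID",
    "assay_date": "VISITDT",
    "result_value": "ORRES",
}

def translate(record: dict, field_map: dict = FIELD_MAP) -> dict:
    """Rename fields per the map; pass unmapped fields through unchanged."""
    return {field_map.get(key, key): value for key, value in record.items()}

lab_record = {"sample_id": "S-1001", "assay_date": "2008-06-12", "result_value": 4.2}
print(translate(lab_record))
# {'USUBJID': 'S-1001', 'VISITDT': '2008-06-12', 'ORRES': 4.2}
```

The point of the design is the one Cerro makes: the contentious question of which format "wins" is taken off the table, because each new system only needs a mapping, not a migration.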
Clinical standards such as Health Level 7, or HL7, and the Clinical Data Interchange Standards Consortium, or CDISC, have wide adoption in the clinical realm, but there is no equivalent in basic research. “The clinical space is much more heavily regulated,” Cerro said. “If the FDA or Medicare … says, ‘We want data in this format,’ then in some way, we will start working with that format.”
Crossover points between basic research and clinical areas are emerging, he said, for example, in pharmacokinetics or toxicogenomics where scientists are working on questions for which FDA guidelines are already in place.
“You are likely to see what areas of research might start to organize themselves a little bit more quickly and [those might] be the areas that would maybe consider trying to standardize first, the areas that are closest to the clinical realm,” he said.
As new technologies emerge for often very specialized applications, “there is not as much critical mass around any particular space within preclinical research to plant a flag [and say,] ‘Let’s rally the troops and get data standards,’” Cerro said.
When he consults for biopharma clients, whom he declined to name, Cerro said he looks to see if a firm has standardized protocols across research sites. “Without splitting the hairs on experimental protocols, I really take a good look at the process. … It is easy when you are working with sophisticated technologies sometimes to lose sight of the very basic quality-control, quality-assurance steps,” he said.
He also said that large companies aiming for data portability don’t want data “to be controlled or locked in to any particular vendor.” Companies such as Bayer or Merck have power in the marketplace to set standards, he noted.
Another MassTLC panelist, Michele Pontinen, who leads the R&D practice at business consulting firm Capgemini, told BioInform that the standards approved and accepted by FDA for electronic submissions are moving beyond the realm of clinical trials. For example, she noted a CDISC module called Standard for Exchange of Nonclinical Data, or SEND, “that reaches back into pre-clinical, so it’s going to get into the researchers’ area.”
Eventually, she expects CDISC standards will be set up for biomarkers and genomics data, she said.
Pontinen worked as a bioinformaticist at the National Institute of Allergy and Infectious Diseases and GlaxoSmithKline before joining Capgemini. She also helped start up Astra Merck, the joint venture between Astra and Merck, and worked for PricewaterhouseCoopers and Booz Allen Hamilton.
Once FDA “turns the key” on its electronic repository and is able to accept all submissions electronically, then standards for electronic submissions will be mandated, though she noted that “there’s no set time frame” because FDA is “still building some of the tools they will need to look at the electronic data.”
For pharma researchers wondering if this standards shift applies to them, “if they’re providing bioinformatics information that needs to go into a submission, whether it’s an [Investigational New Drug] or a [new drug] application, it’s going to have to be in a format that’s acceptable to FDA,” she said. “It’s not going to be that they can pick and choose whether they are going to do this.”
Some of her clients, she said, are “extremely worried about this” and therefore already moving to migrate their submissions into CDISC standards and readying their submissions for SEND. She declined to mention her clients’ names. “They know if they are going to use genomics data or any type of biomarkers data to move toward a targeted treatment group, they are going to have to submit this,” she said.
Other companies are not as concerned about the timeline for introduction of these standards or are unable to react. “What I’ve seen with some clients is paralysis,” she said. “It’s deer-in-the-headlights stuff,” she added, noting that she advises firms not to wait, since data is accumulating at an ever-faster pace. She offers different scenarios depending on “their level of pain,” helping them see opportunities that are not necessarily “that expensive but can position them better down the line.”
These companies are looking for “flexible” IT systems that can support the changing workflow, she said. That means some firms will be looking to outsource their research IT.
Companies looking to avoid late-stage clinical failures are using bioinformatics tools to analyze clinical trials data, incorporating information on adverse events and basing their decisions on “information they have about the compound back in research,” she said.
Citing her experience with standards discussions in developing facets of the National Cancer Institute’s Cancer Biomedical Informatics Grid, Pontinen acknowledged that the research community has had a difficult time reaching consensus on data standards. However, she noted that pharmaceutical firms are looking for public projects like caBIG to drive standardization because their internal funds for data conversion are limited.
She said that the CDISC model may show promise in that it “brought FDA in.”
The Pharmaceutical Research and Manufacturers of America “got behind CDISC and said, ‘Our members would like you, FDA, to accept electronic submissions using these standards.’ And so FDA said, ‘OK, if that is what you want us to use, that will be acceptable … but just remember you are the one who wanted this,’” she said.
“I think the same thing is going to have to happen with the balance of bioinformatics standards that we are all struggling with right now,” she added.
Find It Quickly
Being able to access data quickly can make a big difference in the way companies react to queries from FDA. One client recently confessed that when a query comes in, “it takes us forever — not only in the development IT but going back to the pre-clinical IT environment to find the information,” Pontinen said.
The information is in silos “which do not speak to each other,” and may have been captured in different formats, she said.
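The silo problem Pontinen describes is often attacked by normalizing each silo’s records into a single common shape before querying. A minimal sketch, assuming two invented silos (one CSV, one JSON) with hypothetical field names:

```python
# Hedged illustration of querying across silos "which do not speak to
# each other": normalize each silo's records into one common shape.
# The silo layouts, field names, and sample data are all invented.
import csv
import io
import json

preclinical_csv = "compound,study,finding\nCMP-7,TOX-01,elevated ALT\n"
clinical_json = '[{"drug": "CMP-7", "trial": "PH2-09", "ae": "nausea"}]'

def normalize_preclinical(raw: str) -> list:
    """Map the preclinical silo's CSV columns onto the common shape."""
    return [
        {"compound": row["compound"], "source": row["study"], "event": row["finding"]}
        for row in csv.DictReader(io.StringIO(raw))
    ]

def normalize_clinical(raw: str) -> list:
    """Map the clinical silo's JSON fields onto the same common shape."""
    return [
        {"compound": rec["drug"], "source": rec["trial"], "event": rec["ae"]}
        for rec in json.loads(raw)
    ]

# Once both silos share one shape, an FDA-style query spans them in one pass.
records = normalize_preclinical(preclinical_csv) + normalize_clinical(clinical_json)
hits = [r for r in records if r["compound"] == "CMP-7"]
print(hits)
```

This is the kind of up-front mapping work that turns a “takes us forever” search across development and preclinical IT into a single query.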
Cerro noted that large pharmas are struggling to synchronize their data standards across multi-site facilities. “It’s been very rare to do this by fiat across a large company, [mandating] ‘everybody will use this standard,’” he said. “That’s been tried, but it usually takes a little more consensus-building to come up with good data-sharing within an institution.”
While some pharmas, such as Novartis, have turned to semantic web technology to address this problem, Cerro said he’s not quite sold on the promise of that approach. “I am starting to see cases where it is helping. I wouldn’t go so far as to say it has been transformative yet,” he said.
Cerro said that the “silo” problem isn’t restricted to a drug maker’s internal research groups, but impacts all of its collaborations, whether they are with academic groups, biotechs, or other pharmas.
In order to make those collaborations successful and efficient, he said, “organizations have to take a good hard look at how do they collaborate. What kind of collaboration portals, research portals, do you set up between organizations? Are you going to do a one-off portal for every single collaboration that you engage in, or are you going to try to standardize on certain types? Basically what are the best practices in that space?”
Cerro said that heads of informatics and research computing at large biopharmas are “speaking with each other and trying to agree on some areas where they can agree on some standard approaches to make things easier for the vendors and for the smaller collaborating organizations.”
These efforts are merging to a certain extent with standards efforts in the bioinformatics community, such as MIBBI, the Minimum Information for Biological and Biomedical Investigations portal, he said.