NEW YORK (GenomeWeb) – The Children's Hospital of Philadelphia recently launched a new scientific center that aims to gather and share genomic, clinical, and other useful biomedical data for pediatric cancer and rare disease research.
The so-called Center for Data Driven Discovery in Biomedicine is a joint project of the CHOP Research Institute and CHOP's Department of Biomedical and Health Informatics. It is co-directed by Adam Resnick, an assistant professor of neurosurgery at the University of Pennsylvania School of Medicine, and Phillip Storm, CHOP's division chief of Neurosurgery.
In addition to offering co-localized data and compute power, the center will seek to develop open models of collaboration, data sharing, and scientific integration. Investigators from around the world will be able to access research and clinical datasets through open platforms and will have opportunities to collaborate with a wider pool of clinicians and scientists, according to CHOP.
GenomeWeb talked with Resnick about the center's goals in greater detail and a planned pediatric genomic cloud that's modeled after the NCI's Cancer Cloud pilots initiative. Below is an edited version of that conversation.
Why is CHOP launching this center?
Like any other research hospital, at CHOP we've always been interested in finding the best way to treat children for various diseases. Despite the fact that cancer is the leading cause of childhood disease-related death, by comparison to the adult setting, it's still a rare disease. The advent of next-generation sequencing technologies and the rapid capacity to begin exploring the underlying cause of diseases has transformed the therapeutic opportunities across all diseases in adults and pediatrics.
With that has emerged a dramatic need to begin thinking about how the data that comes from these efforts is generated and how it is best empowered on behalf of individual patients. This is not entirely unique to the pediatric setting, but what is unique is that by and large such efforts require large amounts of data that need to be integrated and brought together for analysis. We need to figure out a way to bring together researchers and clinicians to work on data [and] to bring their own data. The NIH has already begun pilot initiatives for cloud data deposition and integrated analysis, but unfortunately the pediatric community does not have a robust infrastructure to be able to do this, requiring institutions like CHOP to find ways to drive collaborative efforts on behalf of children.
Has any funding been made available for the Center?
This center builds on an ongoing strategic plan that we've had at CHOP that leverages collaborative pediatric consortium initiatives includ[ing] more than seven institutions that collaborate to bring biospecimens into a common repository for analysis, as well as a clinical trial consortium that includes more than 15 institutions. These kinds of efforts are partly sponsored by the institutions themselves, but then a large portion of the funding also comes from CHOP itself and more than 50 foundations who have recognized the limited resources that are available for pediatric research. So, it's a combination of substantial institutional support from CHOP for the center and long-term support and commitment from philanthropic partners, foundations, and partnered initiatives.
One of the things mentioned in the release is that the center would operate under an "open science model." What would that look like for you?
Currently, even though academic science at its core is coordinated around the idea of distribution of findings through publications, we are still obliged to compete with each other for limited resources for grant funding. That has been a successful model for a long time, in which the NIH and other granting agencies drive competition in order to elicit the best science. The challenge that we end up facing is that particularly as emerging technologies come online, the amount of data that can be generated describes a different kind of science where the generation, distribution, and interpretation of data at this scale is different than the traditional model of the individual investigator and their granting effort. It's very challenging to navigate the combination of competition for your own grants and publication and primacy of discovery, and at the same time engage wholeheartedly in collaboration.
The open science model really engages a different underpinning in which the primary question that one asks ... is what is actually best for the patient? That means that the primary directive is no longer to publish first or get your grant from NIH or other granting agency before somebody else does. The primary directive is to make the data available as rapidly as possible to as many people who could potentially be as impactful in using it. No longer can you follow traditional guidelines of moratoriums on release or protected data for a period of time that would allow an investigator to publish their findings before somebody else has access to it. It becomes challenging to justify waiting to provide access to "potential" competitors in order to participate in traditional academic models.
What are some specific things that the CHOP center would do to enable open science?
The pediatric cancer community has historically struggled to get sufficient funding from the NCI; only four percent of the NCI budget goes to them. So to address such deficits they already engage in collaboration through consortia who are working together out of need. The first initiative that the center is now driving is empowering those consortia to co-localize, integrate, and work on these data. One of the challenges is that in contrast to adult cancers, where efforts like [The Cancer Genome Atlas] provide a tremendous backdrop against which precision medicine can be practiced because the data are NIH-generated and publicly available, much of the pediatric data that is available is not generated under NIH support ... and largely not shared or curated, and is only available through accession from storage repositories like dbGaP. They are available upon publication, but even then there's a tremendous mandate for secondary analysis and integration with one's own data, and the only way currently to do that is to try and download that data locally. The center's first initiative is not only to empower data that CHOP is involved in generating ... but then to provide the public space to analyze and work on that data together.
The next initiative, which is ongoing already, is a data-wrangling initiative. We're bringing together pediatric and adult public datasets ... for co-location with computational access in a cloud environment. So, one of the first outputs of the center will be ... the launch of the pediatric genomic cloud platform in partnership with industry partners and consortia. For us this is a mandate — to be able to empower the limited pediatric data that is available [and] allow people to integrate that data across cancer types and across the adult and pediatric space.
In addition to the genomic data integration initiatives are efforts that are initially disease-specific, largely focusing on brain tumors, and then expanding to other disease types, [which] are bringing in highly annotated biospecimens that are not only available to the individual investigators that are collecting them, but to the rest of the world in terms of their capacity to integrate and propose projects on what are rare and precious biospecimens. Not only is the computational and data access an issue, but by virtue of the disease types that pediatric populations encounter, getting enough biospecimens together to perform a project is a challenge, unless you drive collaborations. We've been able to do that successfully [at CHOP] through partnerships such as the Children's Brain Tumor Tissue Consortium. Traditionally these types of biorepositories are closed to the rest of the scientific community ... but the mandate of the center is to try and engage as many people as possible in doing research on pediatric diseases.
The other initiative is to build the data visualization and integration space. Not only is the raw data not widely available in a format that's easily accessible for computation, even the analyzed data is not widely available. We have partnered with MSKCC, Dana-Farber, and Princess Margaret Cancer Centre to take the cBioPortal platform and modify and improve that platform in an open-source fashion in a way that not only allows people to view and integrate pediatric and adult data but also to connect that data to biospecimens and ultimately into the cloud environment.
The pediatric genomic cloud sounds very interesting. Are you modeling it after the NCI's Cancer Cloud pilots?
We are. That's something that's undergoing development right now, and we'll roll it out very shortly. We'll be happy to talk more about it then. It's a key space where [our] partners in industry are allowing the pediatric community to really take advantage of computation infrastructure in the cloud including Amazon Web Services and Google. This will not be a pediatric-only cloud. It's an environment where pediatric data can be integrated with existing adult datasets.
Are there specific pediatric diseases that the center will focus on?
The first projects focus on pediatric cancers, [with] another set of projects coming down the line focusing on epilepsy. We are leveraging consortium initiatives and efforts that CHOP is participating in, which include epilepsy, genomics, and cancer genomics efforts. But that is going to expand further to the microbiome and almost any other molecularly informed disease space. Even for standard healthcare, [such as] well-child initiatives, we can use the power of population genomics and other types of efforts to inform healthcare. But initially, we are really focusing on harnessing the powers of next-gen sequencing and molecular profiling for cancer, so there will be both germline and somatic mutations in cancer subtypes that will initially be analyzed.
In the pediatric community, oftentimes there's a deficit in investment because of small market sizes and more limited interest in child-specific therapeutic development. Something that may be a very prevalent mutation in the pediatric context may be a rare subtype in the adult context. [But] even in the adult space, while a large percentage of a particular subtype might be defined by a recurring mutation, there will be a very long tail of prevalence in terms of frequency that really defines rare subtypes of a prevalent cancer. And so the rare disease challenge ... will no longer just be a rare disease by the traditional formulation, because any prevalent cancer will also have rare cohorts. Discovery in this space can only be empowered through big data analytics. That's what the pediatric community brings to the table. We've already had to address the challenge of collaboration head on ... by necessity.