CHICAGO – After demonstrating the feasibility of a genomic data-sharing network, three major US pediatric hospitals are enhancing the technical infrastructure of their collaboration, seeking external funding and working to make the system scalable in hopes of welcoming additional partners in the future.
In an article published this month in Genetics in Medicine, researchers and bioinformaticians at Cincinnati Children's Hospital Medical Center, Boston Children's Hospital, and Children's Hospital of Philadelphia discussed early work on the Genomics Research and Innovation Network (GRIN), which they described as an "interoperable, federated, genomics learning system."
GRIN creators built an open-source system for querying genotype-phenotype databases at the three hospitals. Researchers can receive aggregate data on patients with specific genotypes and phenotypes in each biobank without the need for a central information repository. The aim is to increase cohort size for discovery purposes, particularly for rare conditions.
"We need very large-scale reference databases in order to interpret genomic information, " particularly in rare diseases," said one of the GRIN leads, Kenneth Mandl, director of the computational health informatics program at Boston Children's Hospital. "The study of them requires large numbers which can only be accumulated through collaboration across sites of care."
Studies with small cohorts of pediatric patients with epilepsy and short stature demonstrated the efficacy of the network, which turned up several new scientific discoveries.
The epilepsy pilot, according to the paper, led to the first description of epilepsy resulting from missense variants in the GABRG2 gene and the discovery of new variants of CACNA1E linked to severe developmental and epileptic encephalopathies.
That study also found de novo variants in AP2M1 via phenotypic similarity analysis based on Human Phenotype Ontology terminology. The researchers said it was the first such gene discovery based on harmonized and standardized phenotypic information for neurodevelopmental disorders.
The short-stature pilot sought to identify specific rare subphenotypes of the condition based on clinical characteristics "readily identifiable as discrete data elements" in the electronic health records, the paper said. The researchers found a genetic etiology through exome sequencing in three of the 10 subjects in the small study.
GRIN is based on the philosophy that participants establish process reciprocity and data interoperability whenever possible, the Genetics in Medicine article said.
Investigators sign a single material transfer agreement for sharing biospecimens when they register for GRIN rather than having to sign agreements for each research project. GRIN also has adopted National Center for Advancing Translational Sciences SMART IRB policies for streamlining project review by institutional review boards.
The authors said that GRIN is aligned with the FAIR (Findable, Accessible, Interoperable and Reusable) data principles promoted by the US National Institutes of Health (NIH). GRIN also uses the NIH-funded Patient-centered Information Commons: Standardized Unification of Research Elements (PIC-SURE) application programming interface to manage distributed queries of local databases.
Infrastructure development focused on five steps. First, the partners adopted the National Patient-Centered Clinical Research Network's (PCORnet) Common Data Model to standardize vocabularies, including a model agreement calling on participating institutions to use the genomic variant call format (gVCF) for exome and genome sequencing results, Health Level Seven International's Fast Healthcare Interoperability Resources (FHIR), and the Observational Health Data Sciences and Informatics (OHDSI) Observational Medical Outcomes Partnership (OMOP) data model.
Second, the three founding hospitals opened up the PIC-SURE API on EHR and genomic repositories. Third, they set up an authentication and authorization process for API access and, as step four, created a custom user interface for querying phenotypic and genotypic databases at the three sites.
Finally, they built an Amazon Web Services cloud-based data provisioning workflow to combine information across sites. This, the authors said, allows for aggregated data to be analyzed with common toolkits, including Jupyter Notebooks.
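The federated pattern behind those steps can be illustrated with a minimal sketch. It does not use the PIC-SURE API or any GRIN code; the class and function names (`SiteNode`, `count_matching`, `federated_count`) and all patient records are invented for illustration. Each hypothetical site answers a genotype-phenotype query with a count only, and the counts are combined without any record leaving its site.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PatientRecord:
    patient_id: str
    gene_variants: frozenset    # genes carrying a variant of interest
    phenotype_terms: frozenset  # Human Phenotype Ontology term IDs


class SiteNode:
    """Hypothetical stand-in for one hospital's local repository."""

    def __init__(self, name, records):
        self.name = name
        self._records = list(records)  # stays inside the site object

    def count_matching(self, gene, phenotype):
        # Only an aggregate count is returned, never row-level data.
        return sum(
            1
            for r in self._records
            if gene in r.gene_variants and phenotype in r.phenotype_terms
        )


def federated_count(sites, gene, phenotype):
    """Send the same query to every site and add up the aggregate answers."""
    per_site = {s.name: s.count_matching(gene, phenotype) for s in sites}
    return per_site, sum(per_site.values())


# Invented example data: HP:0001250 is the HPO term for seizure,
# HP:0004322 the term for short stature.
sites = [
    SiteNode("site_a", [
        PatientRecord("a1", frozenset({"GABRG2"}), frozenset({"HP:0001250"})),
        PatientRecord("a2", frozenset({"CACNA1E"}), frozenset({"HP:0001250"})),
    ]),
    SiteNode("site_b", [
        PatientRecord("b1", frozenset({"GABRG2"}), frozenset({"HP:0001250"})),
        PatientRecord("b2", frozenset({"GABRG2"}), frozenset({"HP:0004322"})),
    ]),
    SiteNode("site_c", [
        PatientRecord("c1", frozenset({"AP2M1"}), frozenset({"HP:0001250"})),
    ]),
]

per_site, total = federated_count(sites, "GABRG2", "HP:0001250")
print(per_site, total)
```

In this toy setup the combined count is larger than any single site's count, which is the cohort-size argument the GRIN leads make for rare diseases.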
GRIN members said in the paper that their consortium adheres to the American College of Medical Genetics and Genomics (ACMG) 2017 position statement calling for widespread sharing of genotypic and phenotypic data.
"GRIN uniquely addresses the ACMG's call to action. It is unusual for three leading and often competing institutions to broadly and deeply collaborate and share data — particularly sensitive genetic data, processes, and patient populations," they wrote.
The researchers said that GRIN is "complementary and additive" to other genotype-phenotype reference databases, including Online Mendelian Inheritance in Man (OMIM), ClinVar, the National Human Genome Research Institute's Genome-Wide Association Study Catalog, the Genome Aggregation Database (gnomAD), and All of Us.
With regard to All of Us, GRIN adds continuous updates to longitudinal phenotypes, and the ability to query aggregate data across all participating locations.
The idea for GRIN dates to early 2015. Funding came through about 3.5 years ago. The effort caught the attention of top leadership at all three hospitals early on, and the CEOs personally approved funding for GRIN.
"They saw an opportunity to leverage the investments they had already made to address issues that by themselves they could not address. No matter how big each one of our hospitals gets, we still don't have enough cohort size to address some of these key issues for rare diseases," said another GRIN leader, Tracy Glauser, associate director of the Cincinnati Children's Research Foundation at Cincinnati Children's Hospital Medical Center.
The CEOs wanted to "do it in a way where the whole was greater than the sum of the parts," Mandl said.
"There are 7,000 rare diseases which individually may not sound like a lot, but in aggregate form a substantial impact on both pediatrics and general public health. And although the three institutions may compete clinically, we felt the time was right in genomics to collaborate in order to enhance our opportunities for discovery and enhance our opportunities to make transformational impacts," Glauser said.
Mandl said that GRIN is distinct from other data-sharing initiatives in that it is not only federated, but also self-governed. "We have found that this kind of self-governance model has so far, at least at the scale that we're operating at now, three hospitals, worked extremely well," he said.
GRIN does set rules for participants, which Mandl said help make the network functional. All data stay local, with some exceptions made on a project-by-project basis. Plus, all data must follow IRB-approved research protocols for patient consent.
"From a governance perspective, it's a club model," Mandl said. "The only people who have access to these data are researchers employed by our institutions who are governed by the rules of our institutions and agree to get the data under those circumstances."
The IT infrastructure allows data to reside locally, but enables users to query across systems and then combine the results in one place for specific projects, Mandl explained.
"Our general approach was that we standardize within an institution so that we can harmonize across institutions," Glauser said.
For IRB consent forms, the three partners agreed on key principles. "But we didn't mandate that the same language be used across the three institutions," Glauser said.
"But then when you try to put those samples together and develop a research protocol, it's door-to-door combat for each IRB protocol to figure out what was consented, what was allowable, what could be shared, what's identified, [and] what's deidentified, and the effort to do that often exceeds the effort to complete the science," Mandl said.
For that reason, the GRIN consortium decided to start with uniformly consented cohorts. However, data sharing often is difficult across EHRs, even those made by the same vendor, simply because of how each system is implemented and the preferences of individual institutions. Boston Children's has a Cerner EHR, while Cincinnati Children's and CHOP run Epic Systems technology.
Thus, GRIN continuously updates data pulled from EHRs and stores pointers to this information in a common ontology, without having to move the records themselves out of the host institution's servers.
"We regularly update our local nodes, which gives us a longitudinal phenotype over time, so that the phenotype information accumulates on the enrolled patients without further effort as a byproduct of the care system," Mandl noted.
Since the pilots wrapped up, the GRIN partners have applied for an NIH R01 research grant to continue their work on identifying new causes of "severe" pediatric growth disorders, according to the Genetics in Medicine article.
GRIN eventually will not be limited to pediatrics. "A desired end state is to allow hospitals to leverage their existing care delivery processes and information technology (IT) structures to acquire and share digital health record data, biospecimens, and a range of omics measurements that are made during the course of clinical care or under research protocols," the researchers wrote.
The founders also plan on inviting in new partners to participate as long as access and data are kept "controlled and safe," Mandl said.
"We continue to look for new opportunities, both internally among the three of us and potentially partnerships outside of the three institutions to even scale it bigger," Glauser explained.
"Everything we've developed within GRIN is designed to be modular and reproducible at the other sites so the IT infrastructure can be installed in a day or two," Mandl added.