Boston Children's Hospital (CHB) and Cincinnati Children's Hospital Medical Center (CCHMC) will use an $800,000 grant from the National Institutes of Health to develop informatics infrastructure for mining pediatric data stored in electronic medical records at both institutions.
The study is one of several research endeavors in the second round of the Electronic Medical Records and Genomics, eMERGE, project. Last June, the NIH's National Human Genome Research Institute set aside $1.6 million to fund between one and three pediatric studies in eMERGE phase II (BI 8/19/2011).
In addition to the CHB/CCHMC grant, the Children's Hospital of Philadelphia has been awarded an eMERGE grant, which it is using to mine the EMRs of more than 40,000 children in its database (BI 6/8/2012).
According to the grant abstract, the Pediatric Alliance for Genomic and Electronic Medical Record Research, or PAGER, comprising CHB and CCHMC, is planning a "sustained, scalable effort to inform and improve the care of the individual child using relevant genome- and phenome-wide association study data."
Isaac Kohane, director of CHB's informatics program and a professor of pediatrics and health sciences and technology at Harvard Medical School, told BioInform this week that the partners intend to show that "we can take defined phenotypic characterizations, define them as these queries against electronic health record data, and in real time find how many patients meeting a variety of criteria exist across our two institutions."
Part of that effort will involve ensuring that the necessary institutional review board requirements and guidelines are met, since patient clinical and genomic data is being collected at two separate sites, he said.
PAGER will also jointly develop guidelines on which research results should be reported back to patients and how best to present that information, and will come up with methods of addressing any attendant ethical issues, he said.
As part of that process, PAGER will explore "attitudes towards, and use of, clinically relevant and incidental genomic findings in patient and control groups," the abstract states.
Additionally, the partners intend to explore ways of inserting pharmacogenetic response data into a clinical decision support system that will be used along with EMRs in clinical settings, Kohane said.
Although both hospitals have indicated interest in studying specific phenotypes, such as inflammatory diseases and pediatric obesity, the eMERGE consortium is still settling on which ones will be priorities for the network, Kohane said.
For eMERGE's second phase, the consortium has said it intends to identify genetic variants that are associated with more than 40 disease characteristics and symptoms (BI 8/19/2011).
To mine the EMRs at their respective institutions, the PAGER researchers plan to apply existing infrastructure built at CHB that's based on the Informatics for Integrating Biology and the Bedside, i2b2, platform, Kohane told BioInform.
The first tool in PAGER's arsenal is the Shared Health Research Information Network, or SHRINE, which eliminates the need for a centralized database: it lets researchers run distributed queries across independent data repositories at multiple health centers and extract "accurate aggregate results," Kohane explained.
Since its development, the tool has been used to develop nationwide registries for pediatric rheumatoid arthritis and pediatric Crohn's disease, he said.
Similarly, for the eMERGE study, "we are not actually going to create a single central Cincinnati/Boston database. Each of our hospital systems will maintain their own separate databases but will be able to find out how many patients meet various phenotypic criteria through this query system," he said.
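The distributed-query idea Kohane describes can be illustrated with a minimal Python sketch. This is not SHRINE's actual interface: the Patient, Site, and federated_count names, the sample records, and the obesity criterion are all hypothetical, chosen only to show how each site can evaluate a phenotype query locally and return nothing but a count.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Patient:
    # Hypothetical, heavily simplified stand-in for an EMR record.
    age_years: int
    diagnoses: frozenset


# A phenotype definition expressed as a yes/no test on one record.
Query = Callable[[Patient], bool]


class Site:
    """One institution; its records never leave this object."""

    def __init__(self, name: str, patients: Iterable[Patient]) -> None:
        self.name = name
        self._patients: List[Patient] = list(patients)

    def count_matching(self, query: Query) -> int:
        # Only the aggregate count is returned, never the rows.
        return sum(1 for p in self._patients if query(p))


def federated_count(sites: Iterable[Site], query: Query) -> Tuple[Dict[str, int], int]:
    """Send the same query to every site and add up the counts."""
    per_site = {s.name: s.count_matching(query) for s in sites}
    return per_site, sum(per_site.values())


def pediatric_obesity(p: Patient) -> bool:
    # Assumed example criterion: under 18 with an obesity diagnosis.
    return p.age_years < 18 and "obesity" in p.diagnoses


boston = Site("boston", [
    Patient(9, frozenset({"obesity"})),
    Patient(12, frozenset({"asthma"})),
    Patient(15, frozenset({"obesity", "asthma"})),
])
cincinnati = Site("cincinnati", [
    Patient(7, frozenset({"obesity"})),
    Patient(19, frozenset({"obesity"})),
])

per_site, total = federated_count([boston, cincinnati], pediatric_obesity)
print(per_site, total)  # {'boston': 2, 'cincinnati': 1} 3
```

In this sketch the coordinator sees only per-site totals, which is the property that makes a single central database unnecessary for cohort-size questions.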
According to the abstract, the hospitals' databases contain a combined total of 2.5 million EMRs. When data is pulled from both hospitals' EMRs, it is transformed, de-identified, augmented with research and legacy clinical data, and then linked to the PAGER biorepositories and associated GWAS data.
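The pull, de-identify, and link steps in the abstract can be sketched roughly as follows. The abstract does not describe how PAGER implements them, so everything here is an assumption made for illustration: the field names, the keyed-hash pseudonym, the reduction of birth date to birth year, and the sample records.

```python
import hashlib
import hmac

# Hypothetical per-site secret; a real deployment would manage this securely.
SECRET_KEY = b"hypothetical-site-secret"


def pseudonym(mrn: str) -> str:
    """Derive a stable pseudonymous ID from a medical record number."""
    digest = hmac.new(SECRET_KEY, mrn.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def deidentify(record: dict) -> dict:
    """Transform one EMR row, keeping only non-identifying fields."""
    return {
        "pid": pseudonym(record["mrn"]),
        "birth_year": int(record["birth_date"][:4]),  # drop month and day
        "diagnoses": sorted(record["diagnoses"]),
    }


def link(records: list, samples: list) -> list:
    """Attach a biorepository sample ID to each de-identified record."""
    by_pid = {s["pid"]: s["sample_id"] for s in samples}
    linked = []
    for record in records:
        row = deidentify(record)
        row["sample_id"] = by_pid.get(row["pid"])  # None if no sample on file
        linked.append(row)
    return linked


emr_rows = [
    {"mrn": "A100", "name": "Jane Doe", "birth_date": "2004-05-17",
     "diagnoses": ["obesity"]},
    {"mrn": "B200", "name": "John Roe", "birth_date": "2001-11-02",
     "diagnoses": ["asthma", "Crohn's disease"]},
]
# The biorepository is assumed to index samples by the same pseudonym.
biorepository = [{"pid": pseudonym("A100"), "sample_id": "S-001"}]

linked = link(emr_rows, biorepository)
for row in linked:
    print(row)
```

Using a keyed hash rather than the raw record number means the same patient maps to the same pseudonym on every pull, so EMR rows and samples can be joined without the identifier itself being carried along.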
The abstract also states that both institutions are in the process of implementing biorepositories that hold about 15,000 pediatric samples in total.
Kohane's team plans to add a natural language processing tool to SHRINE to extract relevant clinical phenotype data stored in narrative text in the EMRs so that "we can actually have full phenotyping for these genetic studies," he said.
He further noted that since the tools are based on i2b2 infrastructure, which has been deployed in about 60 academic health centers, they could also be of use in similar research efforts on a much larger scale.
The group will also contribute 6,861 pediatric cases that have EMRs, DNA, GWAS data, and institutional certification to the database of genotypes and phenotypes, dbGaP.
In addition to the funding provided by the eMERGE program, CHB and CCHMC plan "major investments in the infrastructure supporting eMERGE II," valued at about $50 million "in aggregate over the next five years," the abstract states.
On that front, CHB is investing in its Gene Partnership program, Kohane said.
Among other goals, the program aims to combine genetic, clinical, and personal data in a privacy-ensured and ethically appropriate format, and to come up with ways of providing patients with research results that are pertinent to their health. The program also intends to gather patient samples so that its researchers can look for links between genes, environment, and complex genetic diseases.