Geisinger Expands Preventive Genomic Screening Capabilities With Platform Supported by AWS Grant

NEW YORK – Geisinger College of Health Sciences hopes to advance its clinical genomics capabilities with a scalable population genomics platform that it is building with the support of a $175,000 IMAGINE grant from Amazon Web Services (AWS).

The platform, which will be used internally by researchers and physicians at Geisinger, builds upon the company's MyCode Community Health Initiative, which currently has genomic information on about 230,000 people within the Geisinger health network. The new data platform will also be more automated and user-friendly than the current version of the MyCode platform, enabling more of the network's researchers and healthcare providers to use it.

The IMAGINE grant awarded to Geisinger consists of $150,000 in unrestricted funding, $25,000 in AWS Promotional Credits, and engagement with AWS technical specialists.

Geisinger, which was acquired last year by Risant Health, itself a nonprofit subsidiary of Kaiser Permanente, plans to use the unrestricted funds to hire a senior software engineer for the project and to use the $25,000 in AWS Promotional Credits to support the deployment of the platform in its pilot stage.

Kyle Retterer, chief data science officer at Geisinger, said that despite the wealth of genomic data stored in the MyCode database, its access and use required extensive programmatic and technical expertise, which hampered its scalability.

"We have computing systems, data storage, etc.," he said, "but it's not easy to use, and it had been pretty bottlenecked to a small subset of users who really knew how to work with the data, and anyone else who wanted to get at that data had to go through them."

To improve upon this, Geisinger has automated many of the data processing pipelines contained within the new platform and has been building a more user-friendly interface to improve its accessibility and ease of use.

"It has a lot of powerful features to make life easier for everybody who wants to interact with the data," Retterer said, "and it also gives us the clinical audit trail that we need to actually use this for any clinical decision-making."

The Pennsylvania-based health system hopes that this platform will eventually enable it to adopt a more genomics-first approach to population health that will help its researchers and physicians better understand the phenotypic variability and penetrance of many Mendelian conditions.

"When you start to look at genetics first," Retterer said, "there [are] people who have what looks like clear-cut pathogenic variants, but when you look at their medical record, they don't have any phenotype that corresponds with that."

One key component of this effort will be to continually update all genetic annotations in the database as more information comes to light. Here again, Geisinger is working to automate these processes where possible, as updating annotations is one of the more time-consuming aspects of genomic data curation.
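As a rough illustration of what automating an annotation update can involve, the minimal Python sketch below compares stored variant classifications against a newer annotation release and flags the variants whose classification has changed. It is not Geisinger's pipeline: the `Annotation` class, the `find_reclassified` function, the variant identifiers, and the release labels are all hypothetical and the data is made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """A hypothetical annotation record for one variant."""
    classification: str   # e.g. "pathogenic", "benign", "uncertain"
    source_version: str   # label of the annotation release it came from


def find_reclassified(stored, latest):
    """Return {variant_id: (old, new)} for variants whose classification changed.

    Variants that appear only in `latest` are ignored here, because nothing
    stored needs to be revisited for them.
    """
    changes = {}
    for variant_id, new in latest.items():
        old = stored.get(variant_id)
        if old is not None and old.classification != new.classification:
            changes[variant_id] = (old.classification, new.classification)
    return changes


# Toy, made-up identifiers and classifications, for illustration only.
stored = {
    "var-001": Annotation("uncertain", "release-A"),
    "var-002": Annotation("pathogenic", "release-A"),
}
latest = {
    "var-001": Annotation("pathogenic", "release-B"),
    "var-002": Annotation("pathogenic", "release-B"),
    "var-003": Annotation("benign", "release-B"),
}

changes = find_reclassified(stored, latest)
print(changes)
```

In this toy run only `var-001` is flagged, since it is the only stored variant whose classification differs in the newer release. A production system would also need to record who reviewed each change and when, in line with the clinical audit trail Retterer described.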

"We're definitely thinking of this whole thing as a platform to make genomic information reusable over the life of a patient — easily accessible and queriable, and readily annotated," Retterer said.

Understanding subclinical cases and cases that simply don't manifest in any classical sense is going to be a "real challenge," Retterer said, as genomic medicine expands. Developing tools and systems such as Geisinger's new data platform, which can combine genomic and phenotypic data, will, he added, help lay the groundwork for "truly" genomics-first medicine.
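To make the idea of combining genomic and phenotypic data concrete, the short Python sketch below estimates penetrance as the fraction of variant carriers who also have a matching diagnosis in their record. It is a simplified illustration under stated assumptions, not the method Geisinger uses: the `estimate_penetrance` function and the patient identifiers are hypothetical, and real estimates would need to account for age, ascertainment, and incomplete records.

```python
def estimate_penetrance(carriers, diagnosed):
    """Fraction of variant carriers who also carry the matching diagnosis.

    `carriers` and `diagnosed` are collections of patient identifiers.
    """
    carriers = set(carriers)
    if not carriers:
        raise ValueError("penetrance is undefined without any carriers")
    affected = carriers & set(diagnosed)
    return len(affected) / len(carriers)


# Toy, made-up patient identifiers, for illustration only.
carriers = {"p1", "p2", "p3", "p4"}   # carry a variant classified as pathogenic
diagnosed = {"p2", "p4", "p9"}        # have the corresponding condition on record

penetrance = estimate_penetrance(carriers, diagnosed)
print(penetrance)
```

Here two of the four carriers have the diagnosis, so the estimate is 0.5. The carriers without a recorded phenotype are exactly the cases Retterer described as hard to interpret.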

Geisinger's platform is built on Databricks and uses a lakehouse architecture, which combines the key benefits of data lakes (large repositories of raw data in its original form) and data warehouses (organized sets of structured data), to store and index all project data.

Geisinger completed the genomic data lakehouse and data pipelines by the end of last year and aims to launch a web application for research data queries and analysis within the first half of the current year. The company expects to have the clinical workflow for the new database up and running by the end of the year.

Geisinger expects the new platform to subsequently enable the expansion of current MyCode clinical screening to hundreds more genetic conditions for up to a million patient-participants. Longer term, the company anticipates being able to integrate the genomic data platform with its Epic EHR, thereby enabling real-time use of genomic information in care delivery. Doing so is expected to improve patient care and outcomes by helping to prevent and better manage disease risks, provide earlier diagnoses for existing diseases, and enable more precise treatments for genetic and nongenetic conditions.

One potential risk that the company foresees is the possibility of being unable to directly integrate the platform with Epic EHR. Although Epic provides open APIs and has implemented a basic genomic storage system known as "Genomic Indicators," it is not yet clear whether those tools will be sufficient to meet the needs of this project. Geisinger is currently working with Epic's genomics team to mitigate those risks.

As the new database project evolves, Geisinger hopes to concomitantly expand its MyCode program, both in terms of active sites and the number of genetic conditions screened for.

Retterer said that Geisinger has returned approximately 5,000 results spanning 75 conditions to patients in the MyCode program. The company hopes to add another 100 genetic conditions to that platform.

Retterer said that the details of this expansion remain a bit "fuzzy," as much still needs to be worked out.

"We're definitely looking to expand and accelerate the MyCode program, though," he said.

The project with AWS will help further that goal, he said, by enabling Geisinger to use greater amounts of genomic data more economically.

Retterer said that the opportunity to expand the MyCode program was a big part of what drew him to Geisinger.

"We've shown this works at small scale," he said, adding that the question now is how to "turn the volume all the way up" to being able to analyze whole populations for potentially every Mendelian condition.

"I don't know how long it'll take us to do this," he said, "but that's the goal in the end."