Kaiser Permanente Taps BC Platforms to Analyze Biobank, Clinical Data


CHICAGO – As the largest private healthcare system in the US, one tightly integrated on both the insurance and care delivery sides, Kaiser Permanente has long been held up as a model for healthcare reform and outcomes improvement. Now, with the help of technology partners BC Platforms and Microsoft, Kaiser is looking to improve integration between genomics research and clinical care.

In May, Kaiser and BC Platforms won a Microsoft Health Innovation Award to integrate clinical and genomic data from multiple sites on the Microsoft Azure cloud platform to create a single "virtual biobank." The information will soon be accessible to researchers in all nine Kaiser Permanente regions nationwide, and eventually to people outside the organization.

The biobank, called the Kaiser Permanente Research Bank (KPRB), has 380,000 participants who have provided biological samples and completed surveys to help measure health risks. Alan Bauck, director of the KPRB Data Coordinating Core, said that the collection includes genomic data from Kaiser patients available for researchers, such as genotyping array data as well as whole-exome and whole-genome sequencing data, with smaller panels being the most common type.

Elizabeth McGlynn, VP for Kaiser Permanente Research and interim executive director of the KPRB, said that Oakland, California-based Kaiser is in the process of genotyping all participants in the biobank through arrays and merging information from electronic health records and surveys into the BC Platforms environment, though the COVID-19 pandemic has put the brakes on that effort somewhat. There will be some imputation because the genomic datasets are large.

"The notion is to create an enclave where researchers who are using the collection to do a variety of different types of genomics research can come into a secure environment and have the data and the analytic tools available," McGlynn said. This integrated virtual space will allow Kaiser to track whether researchers are using the data they were granted access to only for their stated purposes.

"What we were trying to solve was both the storage and then the availability of genomics tools that let us work across the enterprise to answer any number of different types of questions," McGlynn said.

Nino da Silva, executive VP for global business development at Zurich, Switzerland-based BC Platforms, said that Kaiser turned to the company to curate, normalize, secure, and manage its biobank data to support advanced analytics. He said that the project is mostly about unlocking the data in the biobank.

Large repositories like KPRB and UK Biobank suffer from having to devote much time to manual data curation, a necessary step in normalizing and formatting data for analytics purposes, according to da Silva, who is based in Singapore. "Curation is a bottleneck. It's the main bottleneck, actually," he said.

BC Platforms is helping to change that. Over the past two years, the firm has worked with Debiopharm to create BC|Match, an automatic curation system that compares datasets against standard frameworks such as the Observational Medical Outcomes Partnership (OMOP) Common Data Model or those of analytics giant SAS.

"The system will automatically curate it and deliver an error list. Then you as a user can go in and teach the system how to interpret it the next time, and every time you teach the system, it becomes more and more automated until it delivers it to a high degree," da Silva said.
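The loop da Silva describes (curate, report errors, learn from the user's corrections, curate again) can be illustrated with a minimal Python sketch. This is not BC|Match's actual implementation, which the article does not detail; the `curate` function, the `TARGET_FIELDS` set, and the field names are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical target schema (illustrative names, not an actual OMOP table).
TARGET_FIELDS = {"person_id", "birth_year", "sex"}


def curate(records: List[dict], learned: Dict[str, str]) -> Tuple[List[dict], List[str]]:
    """Rename source fields to target fields and list anything not understood."""
    curated, errors = [], []
    for i, rec in enumerate(records):
        out = {}
        for name, value in rec.items():
            # A field is accepted if it already matches the target schema
            # or if a user has previously taught the system how to map it.
            target = name if name in TARGET_FIELDS else learned.get(name)
            if target is None:
                errors.append(f"record {i}: unknown field '{name}'")
            else:
                out[target] = value
        curated.append(out)
    return curated, errors


records = [{"person_id": 1, "yob": 1970, "gender": "F"}]
learned: Dict[str, str] = {}

# First pass: the unknown fields end up on the error list.
_, first_errors = curate(records, learned)

# The user "teaches" the system how to interpret those fields.
learned.update({"yob": "birth_year", "gender": "sex"})

# Second pass: the same data now curates without errors.
curated, second_errors = curate(records, learned)
```

Each correction added to `learned` shrinks the error list on later runs, which is the sense in which such a system becomes "more and more automated."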

While BC Platforms works with all major cloud hosts, da Silva said that the Microsoft award was more about the vendor's implementation with Kaiser than it was about the technology itself.

Bauck said Kaiser chose the Azure cloud because it was not cost effective to install this type of computing platform in house. BC Platforms and Azure also provided the flexibility to dramatically expand the scope of this research environment in the future.

The KPRB data is locked in a secure environment, and researchers must apply for access for specific purposes. "The intent of the whole platform is to build a safe, high-quality, analyzable collaboration environment, and managing the data, of course," da Silva said, adding that the data never leaves Kaiser's control.

Researchers can search the database, build cohorts, then apply for access to the cohorts. "When they log in, they can only use the data according to the terms provided in the application, and once they are done, the synthetic cohort dissolves," da Silva said. This virtualized biobank meets US HIPAA and European Union General Data Protection Regulation (GDPR) standards for security.
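As a rough illustration of that access model, the Python sketch below shows a grant that limits reads to an approved cohort and approved fields, logs every attempt, and stops working once dissolved. It is an assumption-laden toy, not the BC Platforms design: the `CohortGrant` class, its fields, and the sample data are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, List


@dataclass
class CohortGrant:
    """Hypothetical access grant tied to the terms of an approved application."""
    researcher: str
    participant_ids: FrozenSet[int]
    allowed_fields: FrozenSet[str]
    active: bool = True
    audit_log: List[str] = field(default_factory=list)

    def read(self, store: Dict[int, dict], participant_id: int, field_name: str):
        """Return one value if the grant covers it; log every attempt."""
        if not self.active:
            raise PermissionError("grant has been dissolved")
        if participant_id not in self.participant_ids or field_name not in self.allowed_fields:
            self.audit_log.append(f"DENIED {participant_id}.{field_name}")
            raise PermissionError("outside the terms of the application")
        self.audit_log.append(f"READ {participant_id}.{field_name}")
        return store[participant_id][field_name]

    def dissolve(self) -> None:
        """End the project: the cohort is no longer readable through this grant."""
        self.active = False


# Made-up data standing in for the biobank; it stays in `store` throughout.
store = {
    101: {"genotype": "AA", "zip": "00000"},
    202: {"genotype": "AG", "zip": "00001"},
}

grant = CohortGrant("researcher_a", frozenset({101}), frozenset({"genotype"}))
value = grant.read(store, 101, "genotype")  # covered by the grant
grant.dissolve()                            # later reads raise PermissionError
```

The audit log is one simple way an operator could check that data is used only for the stated purpose, and the data itself is never copied out of `store`.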

According to da Silva, artificial intelligence and machine learning are rather useless without well-managed, high-quality data in a controllable environment. "You can have the most beautiful algorithm. You can have the most fantastic research concept," he said. "But unless you have data that is well managed and governed, you can't do it."

Da Silva said that BC Platforms has a pedigree in multinational collaborations, including the MultipleMS platform for multiple sclerosis research; the European Sudden Cardiac Arrest network: towards Prevention, Education, New Effective Treatment (ESCAPE-NET); and diabetes research platform IMI-SUMMIT. "In these projects over the last 15 years, we learned how to create combinations of data and securely manage them," he said.

While KPRB is not BC Platforms' largest implementation in terms of the number of research subjects, it is among the most substantial biobanks in the world, according to da Silva. Also, this was the first large-scale implementation the European software company has completed on the Azure cloud. It was a two-year effort that required the development of a new data model to support not only current needs but a future environment that can take more widespread whole-exome and whole-genome sequencing data.

"Once you have this platform with the ability to have this safe way of creating collaborations both internally and externally, then the next logical step for an entity like Kaiser is to create pan-American or global collaborations with different entities," da Silva said. This is possible because the data never leaves Kaiser's control, even during analytics stages.

Kaiser is one of BC Platforms' development partners, and innovations created for this installation are now available to other BC Platforms customers, as well.

Da Silva pointed to the rapid implementation of this platform as one innovation, given the size and geographic footprint of Kaiser Permanente. While work has been going on for two years, deployment started in early 2020 and should wrap up this fall. He expects that the system will go into full production next year, meaning that it will be available to all Kaiser researchers and outside collaborators.

Kaiser Permanente has 12.5 million members in nine US markets and reported 2019 revenue of $84.5 billion. The integrated healthcare delivery system encompasses a health plan, 39 hospitals, more than 700 outpatient facilities, and nine research centers.

Kaiser is using BC Platforms mainly on the research side of its operations, though. "This is about analyzing the data, building the platforms so we can analyze and use the data, not so that we're directly interacting with the patients with it," Bauck said.

McGlynn said that clinical data being used for research analytics is at least a couple of steps removed from operational data that informs care delivery. She said that the current model works best for research because Kaiser has put a substantial amount of work into assuring that the quality, validity, and completeness of its clinical data can answer a wide range of research questions. However, it is not out of the question that the research program evolves into something more clinically focused.

Research supported by the biobank has been primarily funded through traditional sources, especially the US National Institutes of Health. The biobank mostly supports discovery research, according to McGlynn, including work addressing whether specific genetic mutations can predict disease risk.

"What we're hoping [to achieve] with bringing all of this information together is that we can increasingly answer questions that have to do with the clinical implications of some of these discoveries for our membership," McGlynn said.

One key area of interest for KP Research is minority communities, including the health disparities that affect them. McGlynn said that about a third of KPRB participants are nonwhite, but she wants to diversify the research pool further. "We know that one of the limitations of a lot of the genomics research that's been done to date is that it's been done primarily on [white populations]," she noted.

"If we want to serve our very diverse membership, we have to be very certain that the kinds of insights that are coming out actually are applicable to our diverse population" from a clinical perspective, McGlynn said. "We really feel very committed to making sure that we don't leave people behind as we advance our understanding of the role that genetics plays in disease risk and in treating those diseases."

McGlynn said that Kaiser is building a "sandbox" of sorts with BC Platforms so the biobank can test data integration in a controlled environment before introducing it into clinical applications, though that is not in the near-term plans.

While KPRB is a broad resource, Bauck said that the biobank has a cancer cohort and a pregnancy cohort. It has also collected some data about COVID-19 that it is working to integrate with a larger NIH effort to assess genetics and COVID.

McGlynn said that many biobank participants volunteered for altruistic reasons, hoping to benefit future generations rather than to realize any immediate benefit to their own health.

However, members could benefit in the short term from research into the connection between genetics and the environment. Mentioning the wildfires that have ravaged the US West Coast this month, McGlynn suggested that the data could help in studies of the effects of genetics on illnesses related to poor air quality that some Kaiser researchers are planning.

"We're in many different parts of the country and really have the opportunity to have a fairly rich dataset that will help us understand the degree to which insights that we get apply to a broader population," McGlynn said. She expressed hope that this information will be integrated into the health system's broader strategies for population health management.

"It's a long-term bet. It's not an immediate thing, but it really is trying to leverage our research to ensure that the data and information that's made for future clinical decisions is applicable to our diverse membership," McGlynn said.

"The thing we always talk about in implementation science is to try to shorten the timeline" between discovering an innovation that could help Kaiser members and moving it into regular use, McGlynn added. "How do we make that more seamless, the connections between research and the clinical and operational enterprise?"