A team at Harvard Medical School-Partners HealthCare Center for Personalized Genetic Medicine has developed a suite of tools, dubbed GeneInsight, to deal with bioinformatics challenges associated with analyzing next-generation sequence data for clinical diagnostic applications.
Developed over seven years, GeneInsight was designed to address the challenges associated with incorporating genetic test information in patients' care. These challenges include the need for streamlined clinical testing processes, data management, and a means of interpreting the genetic data and providing physicians with results in a clinically useful format.
Composed of three components — GeneInsight Lab, GeneInsight Clinic, and GeneInsight Networking — the suite provides tools to assist genetic testing laboratories with storing and managing genetic variant information; creating interpretative reports; and providing an electronic data transfer hub that transmits results between testing labs and clinicians.
Additionally, GeneInsight includes an interface that lets healthcare providers access patient genetic information, and delivers patient-specific alerts when new information is entered into a source knowledge repository, thus keeping clinicians up to date on their patients' genetic profiles.
Following the Bio-IT conference held in April this year, Sandy Aronson, executive director of IT at PCPGM, told BioInform that his institution was transitioning from chip-based sequencing to targeted next-generation sequencing for diagnostic testing, making it necessary to update what had hitherto been a research-focused pipeline.
He noted that although targeted NGS technologies are expected to have a long "lifespan" for complex genetic testing, they will ultimately be replaced by whole-genome sequencing as the price of the technology drops.
As such, PCPGM is "focused on putting in place the infrastructure needed to launch whole-genome sequencing clinical assays."
For example, as reported last month by BioInform sister publication Clinical Sequencing News, in addition to the GeneInsight suite, researchers at the Partners Laboratory for Molecular Medicine have developed an internal, curated database of variants, which they use in their CLIA laboratory for sequencing-based diagnostic tests for cardiomyopathy and deafness (CSN 07/06/2011).
BioInform spoke with Aronson this week to discuss GeneInsight's development and the need for tools that are designed for clinical interpretation of genomic data. What follows is an edited version of that conversation.
Are there any systems that are similar to GeneInsight available commercially or for free?
Not that I know of.
Is that because developing systems to analyze next-generation sequence data for clinical use is still a relatively new field?
Partners Healthcare realized many years ago that genetics and genomics would be something that could fundamentally improve the way clinicians practice healthcare. They also realized at the time that a significant amount of infrastructure would need to be developed to allow genetics and genomics to scale [up]. Standing up and distributing healthcare infrastructure takes time. So there was a decision made to begin working in this area proactively before the use of clinical genetic testing became widespread. Our goal was to not only stand up infrastructure for ourselves but also to distribute it to other users of clinical genetic tests. So we actually got started in this area very, very early.
In addition to starting early, we also have the advantage of working very closely with the clinical professionals within the Partners Laboratory for Molecular Medicine – a laboratory charged with integrating the latest technologies and discoveries into clinical care. Our system has been constantly refined based on their cutting-edge efforts, including their work launching clinical next-generation sequencing-based testing. More recently, as additional labs and clinics have come on board, we have learned from their use cases as well.
Was GeneInsight originally intended for research purposes and then modified to analyze data for clinical applications? What’s the difference between the two types of systems?
GeneInsight actually was designed and built for clinical use and was used for a long time clinically before it was ever used for research. More recently there have been some uses of it for research purposes. When you look at different parts of the process flow, there are different requirements for research and clinical systems. When you are looking at the interpretive step and the step of maintaining a laboratory case history and so on, the main difference between research and clinical is that research applications tend to focus on de-identified data. There are some clinical uses for exchanging de-identified data but within clinical systems, it’s often important for that data to be fully identified so that you can make sure there is no mix-up and that the right data goes with the right patient.
Let’s talk about how GeneInsight works and the components that make up the system.
GeneInsight’s role in the process starts after the physical laboratory process is complete. A patient sample will be sent to us, the regions of the DNA that are being tested will be isolated and sequenced, and any variants identified in those regions will be confirmed. In the case of the Laboratory for Molecular Medicine, that happens through our [Gateway for Integrated Genomic-Proteomic Applications] system and its case-management module. Then, it becomes important to draft a report that explains what’s known about the clinical implications of those variants and that’s where GeneInsight comes into play.
GeneInsight has three high-level components: a laboratory component, a networking component, and a clinical component. The laboratory component, which we call GeneInsight Lab, in turn has two sub-components. One of those sub-components is a knowledgebase that stores information on the tests that a laboratory offers, the genes that are covered by those tests, the variants that are known to exist in those genes, and the state of knowledge linking those variants to disease states or drug dosing, efficacies, and response. Then there is a reporting engine that the laboratory can configure to pull information from the knowledgebase about those variants when [they] are identified in patients and then ... automatically draft reports that are concise, information-dense, and patient-specific, which is critical for the clinicians who order the tests.
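The two GeneInsight Lab sub-components described above can be pictured with a minimal Python sketch. It is not GeneInsight's actual data model or code: the class names, field names, report wording, and the gene and variant identifiers are all hypothetical placeholders chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class VariantKnowledge:
    """One knowledgebase entry. All field names are illustrative assumptions."""
    gene: str
    variant: str
    disease: str
    classification: str


@dataclass
class Knowledgebase:
    # Maps a test name to the genes that test covers.
    tests: Dict[str, List[str]] = field(default_factory=dict)
    # Maps (gene, variant) to what is currently known about that variant.
    variants: Dict[Tuple[str, str], VariantKnowledge] = field(default_factory=dict)

    def add_variant(self, entry: VariantKnowledge) -> None:
        self.variants[(entry.gene, entry.variant)] = entry

    def lookup(self, gene: str, variant: str) -> Optional[VariantKnowledge]:
        return self.variants.get((gene, variant))


def draft_report(kb: Knowledgebase, patient_id: str, test: str,
                 findings: List[Tuple[str, str]]) -> str:
    """Draft a plain-text report for the variants identified in one patient."""
    lines = [f"Patient: {patient_id}", f"Test: {test}"]
    covered = kb.tests.get(test, [])
    for gene, variant in findings:
        if gene not in covered:
            continue  # skip genes the ordered test does not cover
        entry = kb.lookup(gene, variant)
        if entry is None:
            lines.append(f"{gene} {variant}: no curated knowledge on file")
        else:
            lines.append(
                f"{gene} {variant}: {entry.classification} for {entry.disease}"
            )
    return "\n".join(lines)


# Usage with placeholder identifiers (not real clinical data).
kb = Knowledgebase(tests={"HCM panel": ["GENE_A"]})
kb.add_variant(VariantKnowledge(
    "GENE_A", "c.123A>G", "hypertrophic cardiomyopathy", "unknown significance"))
print(draft_report(kb, "P-001", "HCM panel", [("GENE_A", "c.123A>G")]))
```

Keeping the curated knowledge separate from the report drafting means a reclassification only has to be entered once in the knowledgebase for later reports to reflect it.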
The network infrastructure connects the GeneInsight Lab instances to what we call the GeneInsight Clinic instances. The clinic instances are used by treating clinicians to track what tests have been run on their patients and what variants have been identified through those tests. Because of the networking infrastructure, clinicians are kept up to date as the laboratory certifies new knowledge on variants that were previously reported in their patients. This happens when a laboratory geneticist goes into GeneInsight Lab and makes an update to a variant, for example by changing its classification relative to hypertrophic cardiomyopathy from “unknown significance” to “presumed pathogenic.” They make that change once and then the system generates an alert for patient records in GeneInsight Clinic where that variant was identified. So it’s a way for the doctors to stay up to date on the implications of their patients' genetic profiles over time, which is very important.
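The update flow described above, in which one change in the lab produces alerts on every affected patient record in the connected clinics, can be sketched as a simple publish-and-notify pattern. This is an illustration under assumptions, not GeneInsight's implementation: `LabInstance`, `ClinicInstance`, and their methods are hypothetical names, and the real system applies an importance threshold to alerts that this sketch omits.

```python
from collections import defaultdict


class ClinicInstance:
    """Stand-in for a clinic instance (hypothetical structure)."""

    def __init__(self, name):
        self.name = name
        # variant identifier -> set of patient IDs in whom it was reported
        self.patients_by_variant = defaultdict(set)
        self.alerts = []

    def record_result(self, patient_id, variant):
        self.patients_by_variant[variant].add(patient_id)

    def receive_update(self, variant, disease, old, new):
        # One alert per patient record that carries the updated variant.
        for patient_id in sorted(self.patients_by_variant.get(variant, ())):
            self.alerts.append((patient_id, variant, disease, old, new))


class LabInstance:
    """Stand-in for a lab instance (hypothetical structure)."""

    def __init__(self):
        self.classifications = {}  # (variant, disease) -> classification
        self.clinics = []

    def connect(self, clinic):
        self.clinics.append(clinic)

    def classify(self, variant, disease, new):
        old = self.classifications.get((variant, disease))
        self.classifications[(variant, disease)] = new
        # Notify clinics only when an existing classification actually changes.
        if old is not None and old != new:
            for clinic in self.clinics:
                clinic.receive_update(variant, disease, old, new)


# Usage with placeholder identifiers.
lab = LabInstance()
clinic = ClinicInstance("example clinic")
lab.connect(clinic)
lab.classify("VAR-1", "hypertrophic cardiomyopathy", "unknown significance")
clinic.record_result("P-001", "VAR-1")
lab.classify("VAR-1", "hypertrophic cardiomyopathy", "presumed pathogenic")
for alert in clinic.alerts:
    print(alert)
```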
We have just submitted a manuscript where we looked at the frequency of these events and we actually found that for the GeneInsight Clinic instances that are live today, every month, an average of 0.46 percent of cases have received knowledge updates meeting our importance threshold for a medium- or high-level alert. That’s a fair bit of utility when you are looking at patients who potentially have life-threatening diseases where this information is important for their clinical care.
How can users access the system? Is it possible to install it locally?
To date we have always set up an instance of GeneInsight for each new laboratory or a new clinical organization within our environment and hosted it for them. While we serve as the host they are all separate instances, each with its own database schema separate from all the other instances. Organizations then access their instances through a virtual private network.
We would be open to installing GeneInsight within a remote organization but from an economic perspective, thus far it has always made more sense for us to host it.
Does that mean that you’ve had to increase your compute infrastructure to host all those instances?
There is quite a bit of infrastructure that was required for the hosting environment for making sure that there is adequate redundancy, adequate security, and adequate data separation between the different instances. We do maintain quite a bit of storage within our infrastructure to maintain our systems in general.
GeneInsight itself is not our largest consumer of storage because the laboratories themselves generate so much data that is condensed down to variants before it flows into GeneInsight. However, we are seeing broader and broader spectrum sequencing where we are identifying more and more variants in patients, which will continue up to the day when we transition to whole-genome sequencing. As we move down that path, more storage will be needed to support the GeneInsight instances.
Are you planning to add functionality to the system?
There is constant development of the system. We have a number of developers dedicated to enhancing the system. We will be enhancing the system to support whole-genome sequencing. There are a couple of electronic health record integration efforts underway that will enhance interfaces to the Partners Healthcare and Intermountain Healthcare EHRs. We are working on enhancing the networking structure within GeneInsight to enable new forms of communication between laboratories and clinics. We are looking at integrating the ability to handle clinical trial-related alerting.
Following up on your comments about preparing to handle whole-genome sequencing data, what would need to be in place?
There are a couple of things needed for that. We are still determining what parts of the process GeneInsight itself should support and in what places we should integrate with other efforts that are out there. WGS will generate between two [million] and five million variants per patient. We will need to annotate those data points. Our goal will be to get as much information about each variant as possible. Once we have that, we need to filter, sort, and generate scores based on that information so that we can identify a limited number of variants that are most likely to be clinically causative relative to a given indication so that those variants can be assessed in depth by geneticists and related professionals. Then we need to be able to track those variants as more is learned over time and keep folks up to date as more is learned about their implications and significance.
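The filter, sort, and score step described above can be sketched in Python as follows. The interview does not say which annotations or scoring rules would be used, so everything here is an assumption made for illustration: the annotation fields, the frequency cutoff, the weights, and the function name `shortlist` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set, Tuple


@dataclass
class AnnotatedVariant:
    """A variant plus a few example annotations (fields are assumptions)."""
    variant_id: str
    gene: str
    population_frequency: float  # fraction of a reference population carrying it
    predicted_damaging: bool


def shortlist(variants: Iterable[AnnotatedVariant],
              indication_genes: Set[str],
              max_frequency: float = 0.01,
              limit: int = 10) -> List[Tuple[float, AnnotatedVariant]]:
    """Filter, score, and sort variants; return the top `limit` with scores.

    Assumes max_frequency > 0. The weights below are arbitrary placeholders.
    """
    scored = []
    for v in variants:
        # Filter: drop variants that are common in the reference population.
        if v.population_frequency > max_frequency:
            continue
        score = 0.0
        if v.gene in indication_genes:
            score += 2.0  # gene is associated with the patient's indication
        if v.predicted_damaging:
            score += 1.0
        # Rarer variants score slightly higher (contributes between 0 and 1).
        score += 1.0 - v.population_frequency / max_frequency
        scored.append((score, v))
    # Sort: highest score first; ties keep their input order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:limit]


# Usage with placeholder data.
candidates = [
    AnnotatedVariant("VAR-1", "GENE_A", 0.200, True),
    AnnotatedVariant("VAR-2", "GENE_A", 0.001, True),
    AnnotatedVariant("VAR-3", "GENE_B", 0.001, True),
]
for score, v in shortlist(candidates, {"GENE_A"}, limit=2):
    print(v.variant_id, round(score, 2))
```

The point of the sketch is the shape of the pipeline: millions of annotated variants go in, and a short ranked list comes out for in-depth review by geneticists.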
There is a fair amount of infrastructure involved in these processes and we are actively talking with a number of other companies, organizations, and groups that work in this space. Our goal is to determine where there are opportunities to partner and which parts of this equation we really need to build out in GeneInsight to be effective.
One thing that we know will be important for GeneInsight over time is that the number of variants it manages will continuously grow, and we need to help clinicians stay up to date as new information emerges on them.