Oracle this week threw its hat into the molecular analysis market with the release of Omics Data Bank, a platform that provides tools to pull in data from -omics experiments and link this information with clinical data from electronic medical records.
The product marks Oracle's first release for the -omics space and joins a number of products that the company already offers for the clinical market.
The new release lets users pull in molecular data, such as whole-genome sequence data or gene expression information, and then integrate it with clinical data from EMRs, Kris Joshi, Oracle's vice president of healthcare product strategy, explained to BioInform.
The Omics Data Bank is part of a larger platform dubbed the Oracle Health Sciences Translational Research Center platform, which the company released last September.
The Translational Research Center was designed to help users normalize, aggregate, and analyze data from both internal and external sources, including clinical systems and lab systems, and use this information to identify new predictive biomarkers as well as best practices for diagnosis and treatment.
The Omics Data Bank is intended to enable researchers to integrate -omics data into their broader translational research projects. It was designed to import -omics data in a scalable fashion and to enable researchers to launch queries on this information, whether it's "raw data coming off the sequencing machines" or "annotated summarized data that a typical physician or investigator might look at," Joshi said.
Groups attempting to mine EMRs commonly complain that the records are often unstructured. Additionally, healthcare institutions deploy systems from different providers, which complicates combining their records.
To get around this issue, Oracle has developed the Oracle Health Analytics Data Integration toolkit, which brings in clinical data from multiple systems provided by firms like Epic and Cerner, normalizes it, and then transforms it into a common vocabulary with a common semantic meaning for each data element, Joshi explained.
Additionally, the system maintains the data lineage and associated metadata, so that users can track where the data comes from and how it was transformed between its system of origin and the Translational Research Center. This is intended to alleviate concerns about the reliability of the analytical data, he said.
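To make the idea concrete, the following is a minimal Python sketch of mapping source-specific codes onto a shared vocabulary while recording lineage. It is an illustration only, not Oracle's implementation: the system names, local codes, shared term, and the `normalize` function are all hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical mapping from (source system, local code) to a shared term.
# The system names, codes, and term are invented for illustration.
COMMON_VOCABULARY = {
    ("system_a", "DX-1001"): "breast_cancer",
    ("system_b", "ONC_BRCA"): "breast_cancer",
}


@dataclass
class NormalizedRecord:
    patient_id: str
    term: str
    # Ordered list of steps the record went through on its way in.
    lineage: list = field(default_factory=list)


def normalize(source_system, patient_id, local_code):
    """Map a source-specific code to the shared term and record each step."""
    key = (source_system, local_code)
    if key not in COMMON_VOCABULARY:
        raise KeyError(f"no mapping for {key}")
    record = NormalizedRecord(patient_id=patient_id, term=COMMON_VOCABULARY[key])
    record.lineage.append(
        {"step": "extracted", "system": source_system, "code": local_code}
    )
    record.lineage.append({"step": "mapped", "term": record.term})
    return record


a = normalize("system_a", "P001", "DX-1001")
b = normalize("system_b", "P002", "ONC_BRCA")
```

In this sketch, both records end up with the same term even though they started with different local codes, and each record's `lineage` list still shows which system and code it came from.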
Similarly, Oracle has built direct adaptors that take genomic data generated by Life Technologies, Illumina, and Complete Genomics sequencers, as well as data from gene expression experiments, and convert it into a format that the system can consume, he said.
These adaptors are also used to bring in reference data from publicly available databases including Ensembl and the Cancer Genome Atlas, Joshi said, adding that Oracle intends to add other sequence repositories to that list.
Oracle expects interest in its platform to come from customers in pharmaceutical and biotechnology companies, contract research organizations, and academic research organizations, many of whom currently don’t have infrastructure in place for integrating clinical and genomic data and often attempt to develop internal solutions for that purpose, Joshi said.
These customers could use the platform to select patient cohorts based on particular phenotypes and clinical and genomic profiles, Joshi explained.
Additionally, they could create queries to find patients with a particular kind of cancer, for example, including what treatments they took as well as what tumor samples are available, whether they have been sequenced, and what variants were identified in the data, he said.
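A query of the kind Joshi describes might look something like the sketch below, which uses Python's built-in `sqlite3` module and an in-memory database. The table layout, column names, and sample rows are assumptions made up for this example and do not reflect the actual schema of the Omics Data Bank.

```python
import sqlite3

# In-memory database with a hypothetical, simplified schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE patients (patient_id TEXT PRIMARY KEY, diagnosis TEXT, treatment TEXT);
CREATE TABLE samples  (sample_id TEXT PRIMARY KEY, patient_id TEXT, sequenced INTEGER);
CREATE TABLE variants (sample_id TEXT, gene TEXT, variant TEXT);

INSERT INTO patients VALUES
    ('P001', 'breast_cancer', 'drug_x'),
    ('P002', 'breast_cancer', 'drug_y'),
    ('P003', 'lung_cancer',   'drug_x');
INSERT INTO samples VALUES
    ('S1', 'P001', 1),
    ('S2', 'P002', 0),
    ('S3', 'P003', 1);
INSERT INTO variants VALUES
    ('S1', 'GENE_A', 'v1'),
    ('S3', 'GENE_B', 'v2');
""")


def find_cohort(conn, diagnosis):
    """Return treatment, sample, sequencing status, and variants per patient."""
    query = """
        SELECT p.patient_id, p.treatment, s.sample_id, s.sequenced,
               v.gene, v.variant
        FROM patients p
        JOIN samples s ON s.patient_id = p.patient_id
        -- LEFT JOIN keeps samples that have no variant calls yet.
        LEFT JOIN variants v ON v.sample_id = s.sample_id
        WHERE p.diagnosis = ?
        ORDER BY p.patient_id
    """
    return conn.execute(query, (diagnosis,)).fetchall()


cohort = find_cohort(conn, "breast_cancer")
```

Here `cohort` holds one row per sample for patients with the requested diagnosis, including samples that have not been sequenced, whose variant columns come back empty.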
Joshi did not disclose pricing details for the system other than noting that it depends on the user's requirements.
The company also isn't disclosing how many customers are currently using the system.
Joshi said that Oracle intends to increase its -omics footprint over time and will release more products that are targeted at researchers in this market, although he could not provide specific details about the company's roadmap in the arena.
He did say, however, that Oracle does not intend to address all the data management and analysis issues in the space. Instead, the company will focus its efforts on areas that it is "well suited for," such as building "scalable backend infrastructure" that can handle petabytes of data.
He also noted that while there are products available in the market that mimic parts of Oracle's platform, "there is nothing like this that brings together public domain data sources, proprietary data, and clinical data all in one place" — in part because of the infrastructure that’s required, which is "fairly extensive."
The Oracle Translational Research Center runs on the company's Exadata hardware, a scalable package of servers, storage, networking, and software for storing large amounts of data. The Exadata line starts at 96 CPU cores and 1,152 GB of memory for database processing, and the high-end system includes 160 CPU cores and 4 TB of memory for database processing. The system includes usable capacity of up to 224 TB per rack.
This infrastructure lets users get the results of queries in "near real time," while internal solutions without the same type of computational backing might take much longer, Joshi said.
"We think this real-time aspect of the system and the scalability is really going to be a big change in the market," he said.
Furthermore, the company intends to partner with commercial and academic groups that are already involved in the -omics arena to "build the next set of applications" and to ensure that its platform can support commonly used tools, Joshi said.
Oracle is encouraging customers and collaborators to write applications that will run on the Translational Research Center platform, and it also intends to incorporate existing applications into the system, he said.
As part of these efforts, Oracle is working with researchers at the Inova Translational Medicine Institute who are using its platform as part of a project to sequence 1,500 whole genomes from children and parents in order to gain insight into pre-term births.
John Niederhuber, CEO of ITMI, told Clinical Sequencing News, BioInform's sister publication, that the project — for which Complete Genomics is providing sequencing services — is serving as a pilot for evaluating how whole-genome sequencing could be implemented within a healthcare setting and is the first step toward broader adoption of sequencing within Inova (CSN 9/14/2011).
In a statement this week, Niederhuber said that Oracle is "uniquely positioned" to support the complete integration of the clinical data and genomic data from the project.
Joshi could not provide more details about the partnership with Inova.
He said that Oracle is in discussions with potential partners for similar projects, but could not divulge further details.
Oracle isn't the only company attempting to combine genomic and clinical data in a single system. Last year, IDBS said that it was partnering with UK research hospital King's Health Partners on a nine-year project that aims to use patients' genomic and clinical profiles to develop individualized cancer therapies.
As part of the project, IDBS implemented a translational medicine-informatics platform at KHP's Integrated Cancer Center called the Oncology Research Information System, or ORIS (BI 3/11/2011).