Data4Cure Building Business around Data-Driven Platform for Cancer Analysis, Biomarker ID


NEW YORK (GenomeWeb) – La Jolla, Calif.-based Data4Cure is seeking to build a business around a proprietary cloud-based platform that combines and analyzes multiple kinds of omics data to provide a more comprehensive picture of the genes and pathways involved in disease and to help users identify molecular markers associated with specific disease subtypes.

The so-called Biomedical Intelligence platform combines comprehensive genomic and clinical datasets culled from multiple published resources with a proprietary cell map, constructed from thousands of molecular measurements of cell structure and function covering multiple genes and gene interactions, protein complexes, and molecular pathways. It then uses a proprietary inference engine to mine customers' data, such as whole-exome or whole-genome sequence, in the context of these internal datasets and its cellular map. It identifies the molecular machinery relevant to particular conditions or disease subtypes and associates changes in those systems with factors such as patient survival, drug mechanisms of action, predictors of drug response, and the possibility of relapse.

It provides an information-rich context for exploring the effects of genetic changes in diseases on the cell's broader activities, such as changes to protein complexes and molecular pathways and the functional consequences of these aberrations, Janusz Dutkowski, CEO and co-founder of Data4Cure, told GenomeWeb. This systems-level approach to analysis sets the company's offering apart from systems offered by other commercial outfits that focus on genome-level analysis alone, he added. It "shows you not just the genes but also the higher-level processes that might be affected in the disease," adding a "complementary layer [of information that] provides a guide for how to understand the patient's data [better]."

The cellular map is key to providing that big-picture perspective, he added. The hierarchical map starts at the level of single genes and proteins and moves up through protein complexes and pathway networks to larger macromolecular processes. The system lets users overlay omics data from their patients onto that map and use the combined information to search for new or existing biomarkers. Furthermore, the map is updated automatically as relevant information becomes available in the peer-reviewed literature, ensuring that users can access and benefit from new research. That also distinguishes Data4Cure from other companies that hire experts to sift through the literature and manually curate the information that they provide through their systems, Dutkowski noted.

The system is initially available for two applications, Dutkowski said: cancer analysis and biomarker identification. Its oncology offering is designed to provide customers with a comprehensive interpretation of their tumor data, he said. Customers can submit sequenced tumor samples to Data4Cure's platform and analyze that data in the context of datasets from the Cancer Genome Atlas and similar repositories. They can use the system to compare their input samples to a much larger patient cohort, identify somatic variants in the tumor, and explore pathways, networks, and cellular processes that are perturbed in cancer cases, he said. The same analysis process holds true for Data4Cure's biomarker identification application. Customers of this service can use the company's tools to mine private and public data for network- or pathway-based biomarkers relevant to patient survival, treatment response, drug sensitivity, and so on.

Input samples to the system can be whole-genome or exome data, RNA-sequencing data, or array information. Customers can analyze a single sample or multiple samples at a time. Turnaround times vary depending on the sort of analyses that need to be run, the size and sequencing coverage of the sample, the number of samples to be analyzed, and other factors. For example, processing BAM files from scratch takes several hours. Once the initial processing is complete, however, additional analyses, such as variant annotation, can take several minutes, Dutkowski said. Other analyses are much quicker. For example, users can query the data, adjust their parameters, and then rerun their analysis in seconds or less.

Data4Cure officially opened its doors in 2013 and raised an undisclosed amount in a seed equity round from unnamed private investors in 2014. The basis for its technology was initially developed in the laboratory of Trey Ideker, chief of genetics at UCSD's School of Medicine, a bioengineering professor at UCSD's cancer center, and one of the company's founders, and is described in several published papers.

One of these, published in Nature Biotechnology, demonstrates the feasibility of using a combination of gene and protein network interaction data from Saccharomyces cerevisiae to infer an ontology of gene function that would be comparable to the Gene Ontology. The work in this paper forms the basis of the cellular map used in the company's analyses. The system also leverages methods of integrating sequence from somatic tumor genomes with gene network data to stratify cancers by subtype, as described in a paper published in Nature Methods, and bioinformatics methods of separating signal from noise.

Data4Cure presented its platform publicly for the first time at the Personalized Medicine World Conference, held last week in Mountain View, Calif. The presentation included case studies with two current customers and partners: the University of Washington Center for Cancer Innovation and the Institute for Systems Biology in Seattle, Wash. In addition to these clients, the company also has an unnamed pharmaceutical company as a client. Initial responses to the system at the conference were positive, Dutkowski said, with a number of attendees expressing interest in signing agreements with the company.

The company does not sell its solution to individual users, choosing instead to target entire institutions or pharma companies, Dutkowski said. It might, however, offer single-user access options at a later date, he added.

Besides analyzing their data, customers can also store and catalog multiple types of biomedical data within the company's system. Pricing for storage is arranged on an individual basis with each institution, and the available space grows as needed with no cap, Dutkowski said. He declined to disclose further pricing details.