Syapse Providing 'Free the Data' Software Infrastructure to Enable Data Mining, Visualization


Originally published July 30.

Syapse, a company that develops software products for omics-based diagnostics, will provide software infrastructure for the Free the Data initiative that will enable patients to see their genetic data compared to others and allow clinicians to share variant interpretations with the broader research community.

Through the Free the Data effort, the founding organizations – including health advocacy organization Genetic Alliance, the University of California San Francisco, and genetic testing firm InVitae Corporation – are encouraging patients to get their genetic test reports from their healthcare providers or counselors and submit the information to an open-access database called ClinVar, hosted by the National Center for Biotechnology Information. The initiative is focused on submitting data from genetic test reports that show whether an individual harbors mutations in BRCA1 and BRCA2 genes.

In joining the Free the Data initiative this week, Syapse said it will provide infrastructure that enables data mining, visualization, and reporting. Those submitting data on BRCA gene variants will be able to see their data in comparison to others in the database. Healthcare providers will be able to learn whether BRCA variants carried by their patients confer a heightened risk for hereditary breast and ovarian cancer. Researchers can explore the variants and the evidence on them, and apply the knowledge to their own studies.

A spin-out of Stanford University, Syapse was founded in 2008 and is based in Palo Alto, Calif. According to Syapse President Jonathan Hirsch, the Free the Data movement will utilize the company's semantic computing platform, which stores biomedical information. Free the Data will also run two of Syapse's applications on this platform – one that allows the organizers of the initiative to load, store, structure, and curate the collected data; and another that physicians, researchers, and patients can use for interactive views of the genetic information.

Genetic Alliance and its collaborators launched the Free the Data movement last month as an extension of Sharing Clinical Reports, a project in which the University of California's Robert Nussbaum and other volunteers have been reaching out to cancer clinics and asking them to submit de-identified data from BRCA test reports (PGx Reporter 4/19/2013).

These data-sharing efforts are particularly necessary when it comes to BRCA testing, Nussbaum and others believe, because for almost two decades only one firm, Myriad Genetics, has provided BRCA testing. Seven years ago, the firm stopped sharing data on BRCA variants and their association with disease in an open-access repository, and began storing the information in a proprietary database (PGx Reporter 11/2/2012).

As a result, other labs and researchers have limited knowledge about rare BRCA variants and their association with cancer. With efforts such as Sharing Clinical Reports and Free the Data, researchers and patient advocates are trying to collectively build up the knowledgebase around BRCA gene mutations, discourage others in the life sciences field from storing genetic data as trade secrets, and speak out against diagnostic monopolies in general.

"The Free the Data initiative comprises many laboratories and medical centers, from UCSF and Geisinger to the ICCG consortium, and data sharing of clinically relevant mutations is the main goal of the initiative," Hirsch said. "Syapse provides distributed, cloud-based software for managing, mining, and utilizing complex biomedical data, along with collaboration features, to enable sharing between all partners in the initiative and with the broader biomedical community."

Genetic Alliance and the organizers of Free the Data have said that patient privacy is not a point of concern in this initiative. Genetic data stored in ClinVar is de-identified. Through a privacy and permissions infrastructure called "Registries for All," those who decide to participate through Free the Data can control who can access their information.

The privacy protections in Free the Data "enable individuals to submit their data and control exactly how each component of their data is viewed and used down to a very granular level," Hirsch said. "We will utilize each individual's privacy and permissions settings as captured in 'Registries for All' to define with whom each individual's data can be shared," for example, if the person wants to share their data "with Genetic Alliance only, with academic researchers but not companies, or with everyone."

Myriad has expressed concern, however, that sharing data in public projects could result in a privacy risk. In a June 20 letter published in the New York Times, Myriad CEO Peter Meldrum wrote the following: "We believe that medical decisions based on the interpretation of genetic data are crucial to the well-being of patients, and we know of numerous examples in which the clinical use of unregulated public databases has jeopardized patient safety or privacy. Patients alone should ultimately have the right to decide whether their personal genetic data is deposited into public or government databases. At Myriad, our policies emphasize patient privacy and safety through our regulated laboratory process."

Several studies have found that data privacy is a challenge and that re-identification is possible in public sequencing projects that analyzed the whole genomes or exomes of individuals. However, ClinVar will store only variant data, which, according to experts, doesn't carry the same privacy risks as storing whole genome or exome data.

Through Free the Data "we will only be storing variant level information, similar to what is in ClinVar, not the underlying sequence or whole genome or exome information," Hirsch noted.