NEW YORK (GenomeWeb) – Researchers involved in the Global Alliance for Genomics and Health's Beacon project are partnering with the European Life-Sciences Infrastructure for Biological Information (ELIXIR) initiative to establish beacons at multiple sites that will make it easier to search European genomic datasets, and to develop protocols for securely sharing phenotype data.
The GA4GH's Beacon project provides infrastructure that enables participating sites to share genomic data without compromising its security. Beacons are essentially a network of servers installed locally by institutions that external users can query to determine whether an associated dataset contains information about a given genetic variant. To date, 70 sites worldwide have set up beacons, including seven in the UK and another nine elsewhere in Europe.
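The yes-or-no lookup described above can be illustrated with a minimal Python sketch. It is not the GA4GH Beacon API: the `Variant` and `Beacon` classes, their field names, and the example coordinates are hypothetical, chosen only to show that a beacon reveals whether a variant is present and nothing more.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """A hypothetical variant key: where it is and what changed."""
    chromosome: str
    position: int
    reference: str
    alternate: str


class Beacon:
    """An illustrative beacon that only answers presence queries."""

    def __init__(self, name, variants):
        self.name = name
        # The underlying dataset stays private to the hosting site.
        self._variants = frozenset(variants)

    def has_variant(self, query):
        # The only thing disclosed to the caller is a boolean.
        return query in self._variants


beacon = Beacon("example-site", [Variant("17", 41245466, "G", "A")])
found = beacon.has_variant(Variant("17", 41245466, "G", "A"))
missing = beacon.has_variant(Variant("1", 100, "C", "T"))
print(found, missing)
```

Because the caller never sees the records themselves, a site could in principle sit a layer like this over whatever data store it already uses.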
ELIXIR, for its part, provides resources and infrastructure to help life science organizations across Europe manage their biological data. In 2015, the initiative received €19 million in funding from the EU to support efforts to implement ELIXIR programs. In total, ELIXIR has 20 nodes across Europe that are coordinated by the ELIXIR hub based near Cambridge in the UK.
Specific objectives for the partners in 2017 include setting up a network of ELIXIR beacons that will let users query all beacons simultaneously; developing new features and new security measures that will encourage sharing by stakeholders with more sensitive data; and forming strategic partnerships with national data owners to enable data flow to the Beacon service. Furthermore, the technical advances that emerge from the partnership will feed back into the broader Beacon project, the partners said.
This partnership expands on an existing one that began in 2015 and has already resulted in beacons at ELIXIR nodes in Sweden, Finland, France, Switzerland, Belgium, and at the European Genome-phenome Archive, a joint project of EMBL-EBI and the Center for Genomic Regulation. Another beacon will soon be launched in the Netherlands.
ELIXIR will join a network of beacons that the GA4GH has established, project co-lead Marc Fiume said in an interview. Fiume is also CEO and co-founder of DNAstack, which developed the Beacon network, a search engine for exploring publicly available beacons. The partnership with ELIXIR will help establish a unified network for simultaneous queries across nodes in Europe and elsewhere.
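As a rough illustration of what a simultaneous query across a network might look like, the Python sketch below sends the same yes-or-no question to several beacons in parallel and collects one answer per site. The function name `query_network`, the site names, and the in-memory sets standing in for remote beacons are all assumptions made for the example; the article does not describe how the Beacon network is actually implemented.

```python
from concurrent.futures import ThreadPoolExecutor


def query_network(beacons, variant):
    """Ask every beacon the same yes-or-no question in parallel.

    `beacons` maps a site name to the set of variants it holds.
    In a real network each lookup would be a remote call.
    """
    def ask(item):
        name, variants = item
        return name, variant in variants

    with ThreadPoolExecutor() as pool:
        return dict(pool.map(ask, beacons.items()))


# Hypothetical sites and variants, for illustration only.
network = {
    "site-a": {("17", 41245466, "G", "A")},
    "site-b": {("1", 100, "C", "T")},
}
answers = query_network(network, ("17", 41245466, "G", "A"))
print(answers)
```

Each site still answers only for its own data; the network layer just fans the question out and gathers the replies.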
Moreover, the ELIXIR beacons will be the first to let researchers request some clinical information in addition to genotype data. "We've proven the traction on the public data-sharing side but we are trying now within the next year or so to make beacons more relevant in the clinical setting and to serve more sensitive information like phenotypes and for that we're developing a federated secure process through which you can log in and query beacons," Fiume said. "We want to create a sort of system for progressive disclosure of information where you might be able to discover that there is information of interest on the public level but if you want more clinically oriented information like phenotype, you might have to log in."
Researchers will still ask yes-or-no questions to find variants at beacon sites, but with the proper authentication they will also be able to access clinical features associated with the patient who has the genetic mutation of interest. Participating institutions will choose what sort of identification they want requesting researchers to provide in order to access phenotype data, Fiume said. "What we are trying to do is develop the tools that allow organizations to build and participate in a sort of internet for genomics, but within that framework, they are still able to specify a level of trust and which identity they trust."
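The progressive-disclosure idea can be sketched in a few lines of Python. This is an assumption-laden illustration, not the authentication scheme the partners are building: the `TieredBeacon` class, the `identity_provider` parameter, and the sample phenotype term are hypothetical. It shows only the principle that anyone gets the public yes-or-no answer, while phenotype data is returned solely to callers whose identity comes from a source the hosting site has chosen to trust.

```python
from typing import Optional


class TieredBeacon:
    """An illustrative beacon with a public tier and a trusted tier."""

    def __init__(self, records, trusted_identity_providers):
        # records maps a variant key to its associated phenotype terms.
        self._records = dict(records)
        # Each site decides for itself which identities it trusts.
        self._trusted = set(trusted_identity_providers)

    def query(self, variant_key: str,
              identity_provider: Optional[str] = None) -> dict:
        exists = variant_key in self._records
        response = {"exists": exists}
        # Phenotypes are disclosed only to callers with a trusted identity.
        if exists and identity_provider in self._trusted:
            response["phenotypes"] = list(self._records[variant_key])
        return response


site = TieredBeacon(
    records={"17:41245466:G>A": ["example phenotype term"]},
    trusted_identity_providers={"trusted-idp"},
)
public_answer = site.query("17:41245466:G>A")
trusted_answer = site.query("17:41245466:G>A", identity_provider="trusted-idp")
print(public_answer, trusted_answer)
```

Keeping the trust list on the site side mirrors the point in the text that institutions, not the network, decide which identification is sufficient.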
The partners are also mulling methods of supporting millions of queries as well as ways to detect potential attacks and minimize risks to patients' privacy, Fiume said. "Anytime you share genotype information, there is a risk [of] re-identification," he said. In partnership with ELIXIR, "we're trying to understand and mitigate the risks associated with those types of queries against our system."
Under the terms of the agreement with GA4GH, ELIXIR is providing funds and technical expertise to implement beacons at additional ELIXIR nodes and to implement a graduated system of authentication protocols that will govern access to sensitive phenotype data. "We have a strong commitment to develop tools and set standards for sharing genomic data and international collaboration," according to Serena Scollen, head of human genomics and translational data at ELIXIR and co-lead of the Beacon project. "Working in silos to manage data is not effective, but we can't just be thinking about Europe. We need to be aligned with and developing standards that are being set on an international scale. And that's where it fits with GA4GH."
The partnership will be supported by funds set aside within ELIXIR's budget for implementation studies, short-term projects performed by more than one ELIXIR node to address key scientific and technical issues within the organization. "The partners who will be involved in this project will be those groups that are focusing on making datasets available [but] in addition to that, we do have a part of the project where we will be looking at key datasets … and we'll be focusing on trying to use links to make those available as well," Scollen said in an interview. Examples from existing beacons include population datasets from countries like Sweden and Finland that have generated whole-genome sequence data on subsets of their respective populations.
Furthermore, participating nodes have technical developers in place with different backgrounds and levels of expertise that they can contribute to the project. "Having people from diverse backgrounds and different datasets and not all from the same institute is valuable to making this really innovative," Scollen said.
Having a presence in Europe makes it easier for GA4GH to collaborate with existing precision medicine initiatives such as Genomics England or large-scale sequencing projects, according to Fiume. "With the Beacon project we have significantly lowered the time it takes to share data," he said. "I think the reason we got so much traction and so much adoption early is because it's meant to be a general-purpose protocol for sharing information in a way that you could layer on top of any underlying data store whether you have VCFs or an EHR that has genomic information."
More broadly, "we are very eager to collaborate with any data-sharing initiative which is serving genotypes," he said. So far, "we have [had] lots of traction with the research community and large-scale sequencing projects but we're [also] seeing patient-initiated data sharing beacons."
One example is MyGene2, a web tool through which families with rare genetic conditions who are interested in sharing their health and genetic information can connect with other families, clinicians, and researchers. The platform includes a database for sharing health information and genetic data that is searchable by researchers and families. Fiume's group has developed an adaptor that allows users to search for genotypes in the database using beacon infrastructure.
Once they are able to implement infrastructure to securely share phenotypes with the European beacons, Fiume expects that the group will look into possibly implementing the approach in other contexts that need to share sensitive information on patients, for example within the rare disease community. "I would like to see [Beacon] become a 'Google' for genomics where you can discover where there is information of interest, and then there are some tools that allow you to refine what exactly you are searching for whether it's clinic or research or in silico or patient," he said. "The technology is general purpose so we can certainly collaborate with anyone who is willing to share genotypes."