NEW YORK (GenomeWeb) – The Global Alliance for Genomics and Health (GA4GH) recently named seven genomic data initiatives as new driver projects for 2019, expanding the coverage of the standards-setting organization to countries in Asia and Africa.
The alliance now has 22 driver projects — initiatives that have agreed to work with and promote GA4GH — representing people from more than 100 countries, according to chair Ewan Birney, director of the European Molecular Biology Laboratory's European Bioinformatics Institute.
The alliance has also deepened its ongoing partnership with ELIXIR, while continuing to roll out new tools and standards for genomic data discovery, analysis, and interpretation.
GA4GH's new driver projects include Genome Medical Alliance Japan (GEM Japan), Human Heredity and Health in Africa (H3Africa), the Swiss Personalized Health Network (SPHN), the European Joint Program on Rare Diseases (EJP RD), EUCANCan, EpiShare, and the Autism Sharing Initiative. According to the organization, the projects were selected based on global representation, scientific merit, and ability to contribute resources to development efforts.
"These driver projects will bring some new perspectives and some new priorities and new engineers," noted Birney. "Each had to commit basically two engineers" to GA4GH, he noted.
While Birney praised the addition of all of the projects, he specifically highlighted the involvement of GEM Japan, H3Africa, and SPHN.
Japan's Agency for Medical Research & Development (AMED) manages GEM Japan, which aims to enable sharing of genomic and phenotypic data from Japanese research efforts with other communities. The project has established a nationwide alliance of universities, institutes, and hospitals committed to advancing genome-based medicine. According to the organization, it aims to share the allele frequencies as well as disease variation of the Japanese population and to hold specific workshops to encourage the uptake of GA4GH tools by Japanese researchers.
"This is a project that goes from clinical healthcare through to population genetics and genomics across Japan," noted Birney, adding that "it is the first major involvement of an Asian country" in GA4GH.
By joining GA4GH as a driver project, H3Africa meanwhile expands the reach of the alliance into more than a dozen African countries. The project aims to create a network of laboratories across the continent for studying disease susceptibility and drug responses in African populations. This includes giving African scientists access to genomics technologies, such as next-generation sequencing, and to the relevant bioinformatics capabilities. Scientists in the computational biology group at the University of Cape Town are managing much of H3Africa's bioinformatics activities.
Birney noted that the addition of H3Africa provides GA4GH with pan-African coverage as it builds its tools and standards. "That is going to be terribly important for research because of the amazingly higher diversity in sub-Saharan Africa," he said.
Nicola Mulder, head of the computational biology group at the University of Cape Town, agreed with Birney's assessment.
"H3Africa is a bit different to the other driver projects in that it is not a single project but a consortium of projects related to genomics," Mulder said. As such, H3Africa has overcome challenges in doing large-scale genomics in Africa, and can share with the alliance its experiences in working with limited resources, overcoming ethical and consent challenges across multiple countries, and building infrastructure for genomic data management and analysis, she said.
"African genomes are more diverse and currently under-represented in public repositories, therefore H3Africa has much to offer in increasing diversity in genetics and link to diseases," Mulder said.
A third driver project that Birney chose to highlight is SPHN. This effort, managed via the Swiss Academy of Medical Sciences and the Swiss Institute of Bioinformatics, aims to develop a nationally coordinated infrastructure to enable genomic and health-related data sharing for research purposes across Switzerland.
"This is another country-level engagement," Birney said, "offering a good balance of genomics and digital health." Switzerland commenced the project in 2017, with an initial budget of around CHF 40 million ($41 million) to support IT and clinical data interoperability through 2020.
The other four driver projects added earlier this month are EJP RD, which is focused on developing tools, projects, and programs across Europe to support rare disease research; EUCANCan, a federated network of infrastructures in Canada, Germany, the Netherlands, France, and Spain for the analysis, management, and sharing of cancer genomic data; the Autism Sharing Initiative, a federated, global network for sharing genomics and clinical data related to autism; and EpiShare, a joint project between the International Human Epigenome Consortium and the Encyclopedia of DNA Elements that is adapting GA4GH resources to support epigenomic data sharing and analysis.
The new driver projects are the first major expansion for GA4GH since it launched its five-year strategic plan, called GA4GH Connect, in 2017, with 13 initial driver projects. Expanding its list of driver projects has long been an aim of the alliance. Birney said the addition of the driver projects has given the alliance a "real mandate" to continue its efforts.
ELIXIR and DUO
GA4GH has a long-standing partnership with ELIXIR, the European infrastructure for life science data, which joins bioinformatics resources across 23 European countries. ELIXIR also takes part in GA4GH as a driver project through ELIXIR Beacons, an initiative that aims to establish data-access beacons at multiple sites that allow researchers to query European genomic datasets as well as to develop protocols for securely sharing phenotype data.
GA4GH and ELIXIR recently agreed to a strategic partnership that will deepen collaboration between the organizations across all of GA4GH's work streams. Specifically, the organizations will aim to build capacity in the areas of cloud computing and authorization and authentication infrastructure, they said.
"GA4GH sees ELIXIR as our European framework for coordinating a global standard setting effort,"said Birney. "So it's a global effort with Europeans leading the way." Though he noted that implementing standards across multiple countries is "always a headache" in Europe, there is also a "funny benefit" in that Europeans often have to solve transnational issues early on.
"Europeans quickly realize you have to solve the transnational problem, or else it doesn't work," noted Birney. "It's a bit like the GSM standard in mobile phones."
ELIXIR will serve as a conduit for new GA4GH tools and standards, the latest of which is its Data Use Ontology (DUO). The alliance developed DUO to streamline the process by which researchers gain access to data.
Typically, institutions rely on informed consent forms that describe the secondary-use restrictions and conditions on their datasets. This means that each data access request must be manually evaluated against the data use letter that specifies how the dataset can be used. Because of this, data access committees can take between two and six weeks to respond to a request.
GA4GH's DUO standard was created to overcome this challenge. It provides a shared understanding of the meaning of data use categories, enabling data stewards across different resources to tag their datasets with common restrictions on how those data can be used. DUO is also distributed as a machine-readable file that encodes both how the data can be used and how a researcher intends to use the data.
According to the organization, the new standard provides a framework for automatically granting researchers access to multiple datasets based on their credentials. In addition to authenticating the researcher, it matches each dataset's data use restrictions against the researcher's intended research use.
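The matching step described above can be illustrated with a minimal Python sketch. It is not the DUO standard or any GA4GH implementation: the `DataUseTag` and `AccessRequest` classes, their fields, and the dataset names are hypothetical simplifications invented for this example, and real DUO terms are ontology identifiers rather than booleans.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class DataUseTag:
    """Hypothetical, simplified stand-in for a dataset's data use restrictions."""
    academic_only: bool = False       # only academic researchers may use the data
    allow_commercial: bool = True     # commercial use is permitted
    disease: Optional[str] = None     # None means no disease-specific restriction


@dataclass(frozen=True)
class AccessRequest:
    """Hypothetical description of how a researcher intends to use the data."""
    academic: bool
    commercial: bool
    disease: Optional[str] = None     # disease under study, if any


def is_permitted(tag: DataUseTag, request: AccessRequest) -> bool:
    """Return True if the intended use satisfies every restriction on the dataset."""
    if tag.academic_only and not request.academic:
        return False
    if request.commercial and not tag.allow_commercial:
        return False
    if tag.disease is not None and request.disease != tag.disease:
        return False
    return True


def matching_datasets(catalog: Dict[str, DataUseTag], request: AccessRequest) -> List[str]:
    """List, in sorted order, the datasets whose tags permit the request."""
    return sorted(name for name, tag in catalog.items() if is_permitted(tag, request))


# Illustrative catalog; the dataset names and tags are made up.
catalog = {
    "cohort_a": DataUseTag(),
    "cohort_b": DataUseTag(academic_only=True, allow_commercial=False),
    "cohort_c": DataUseTag(disease="autism"),
}

academic_autism = AccessRequest(academic=True, commercial=False, disease="autism")
commercial_general = AccessRequest(academic=False, commercial=True)

print(matching_datasets(catalog, academic_autism))
print(matching_datasets(catalog, commercial_general))
```

In this sketch a request is checked against all tagged datasets at once, which is the part of the process that would otherwise require a person to read each dataset's consent terms. Researcher authentication and any human review are deliberately left out.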
"We are releasing about one to two standards every few months, and DUO, the latest one, allows people to describe how a dataset can be used," said Birney. "Is it academic research only? Is it commercial? Is it only in a particular disease? Rather than having to read a piece of English, Finnish, or French text to understand what's going on, you can tag things with this ontology."
According to Birney, DUO should speed up decisions on whether researchers can get access to a dataset. Though the tool still relies on human supervision, Birney suggested that in the future the data access process could become even more automated.
"In a future fantasy world, there may be some low-risk access mechanisms where everything is negotiated electronically," Birney said. "I think we will need a graduated risk concept in that, but at the moment there is an email and human process and we want to make it into an automated, web-centric process with some human input," he said. "Probably in the future in low-risk scenarios we'll take out the human, but in high-risk scenarios you will always need human input."
GA4GH previously released four products at its sixth plenary meeting in Basel, Switzerland, in October. These included the Beacon application programming interface, a variant search protocol; the Workflow Execution Service, which lets researchers run genomic informatics tools and workflows on data in various environments; Htsget, which allows users to stream data without having to copy and transfer large files; and Refget, which helps people retrieve reference sequences.
The alliance will hold its next plenary meeting in Boston in October, Birney said.