The European Commission has awarded €4.5 million ($7 million) to a consortium of 32 research organizations, universities, and companies from 13 countries to determine how to transform Europe’s biomedical data resources into a transnational “sustainable integrative bioinformatics network” for the life sciences.
The consortium is led by the European Bioinformatics Institute. The first year and a half of this project, called ELIXIR, for European Life-science Infrastructure for Biological Information, is aimed at gathering input, performing technical-feasibility studies, and doing user surveys among researchers who generate and use data, and develop tools.
The project then will try to hammer out a plan to be discussed with the member states and their grant funding agencies, EBI Director Janet Thornton told BioInform.
Much like the Oxford English Dictionary, which will remain useful as long as people speak English, so, too, will “the infrastructure for biological information … continue as long as people are interested in biology, health, and medicine,” she said.
But provisions must be made to allow that to happen, including upgrading the infrastructure itself, promoting database interoperability, and using distributed annotation technologies.
ELIXIR will not unfold with “a big bang” but rather evolve in stages. And it does not solely concern the lions of the European database world such as EBI and the Swiss Institute of Bioinformatics, said Thornton. There are also databases maintained through international collaboration, such as InterPro and IntAct, as well as many small ones.
According to Thornton, of the world’s approximately 900 biomedical databases, 40 percent are in Europe, many of which are specialist databases. “They are often in an individual’s laboratory; a research group may have developed a data resource that other people begin to use, and we really need to find a way to incorporate those data resources into the larger network in a way that people can access them easily and without necessarily knowing that they were there in the first place,” she said.
ELIXIR will help link the core and the specialist resources more closely while remaining as “seamless and transparent as possible to our users,” said Thornton. Clicking through databases from link to link is possible for researchers, she said, but clickthrough gymnastics becomes an impossible exercise when a microarray delivers a list of hundreds of genes that must be analyzed with a systems perspective, which requires plowing through and mining data that are stored in different ways with differing vocabularies.
“If you are doing large throughput bioinformatics you need computational access to large data resources,” she said.
Thornton does not tiptoe around what she sees as one of the largest challenges when it comes to database sustainability. “In Europe our core resources are not sensibly funded at the moment,” she said. EBI gets half its funding on a rolling basis through the European Molecular Biology Laboratory and half through hundreds of non-rolling external grants, she said.
“Trying to run these really important core resources with this funding model is a logistical nightmare,” said Thornton. If they do not obtain funding, these core resources might be endangered, which would have a “drastic effect on the life of biologists.”
By comparison, the National Center for Biotechnology Information in the US does not face these types of financial challenges, she said. So this new project is about securing the resources already in place as well as assuring their future.
Needs in the life sciences, particularly given second-generation sequencing technologies, bring with them massive requirements for data storage and data accessibility.
“Part of ELIXIR is to provide a framework so that when new resources are needed, there is a mechanism in Europe to decide how those can be funded competitively,” she said.
Part of ELIXIR is also about developing protocols and standards so that when new resources are funded, there are guidelines on how to structure these databases so that they communicate most easily with the core resources.
On another level ELIXIR must help build communicative avenues between the life sciences and European supercomputing centers. “There are differences in perception on how this is going to develop,” said Thornton. These centers have mainly been built for the chemistry, physics, and climate-modeling communities.
Another facet of ELIXIR involves software tool and service providers. “Most of bioinformatics software is freely available in the academic domain,” she said. Some companies do exist in this space, for pathway analysis software, for example.
“I think there is space for them,” she said, but many struggle to keep their products at the “cutting-edge” of life sciences. The principle for ELIXIR is that data are available in the public domain and that academic software is freely available, she said. “But that … should help the software companies,” she added.
Harmonizing biological data so it can travel from the lab level up through the university, the national level, and all the way to EBI requires continuity, said Tommi Nyrönen, the ELIXIR coordinator at the Finnish supercomputing center CSC-Scientific Computing, a publicly owned, not-for-profit firm where he and his colleagues provide bioinformatics services for scientists.
“If that chain doesn’t work, the data can end up somewhere but not where a researcher, say, in Japan, can find it,” he said. “This is a huge problem,” and not one his supercomputing center can solve on its own. “That is why it is great to have ELIXIR,” he said. “In Europe there is no other player but EBI who could strive to lead this [effort].”
National funding agencies in Europe like to support specific, focused projects, such as those studying which genes are activated in specific diseases. “All this information will end up on CDs on shelves of the research scientist who produced it, and go nowhere,” said Nyrönen.
These agencies don’t ask scientists to distribute their raw data for others to perform additional analysis. There is a “huge disproportion” between the amount of money given to produce this data and the amount spent on storing and distributing it, he said, adding that the CSC could enable data sharing if the funding agencies make this a priority.
Finland relies on a single supercomputing center, which is also the country’s sole ELIXIR partner, to handle projects in physics, chemistry, linguistics, and biology. It has a Cray XT4 machine, a large Linux cluster, and a data archive.
Among Nordic countries, although Finland is “quite strong” in bioinformatics, Sweden and Norway are “ahead of Finland” in their investment in infrastructure to enhance data exchange in biomedical research, said Nyrönen. This connection has been built on a tradition of trans-national collaboration. ELIXIR could take that tradition to the next level by adding more countries, he said, and help to create awareness that more IT manpower is needed in the biomedical sciences.
Johan van der Lei is the ELIXIR coordinator at Erasmus Medical Centre in the Netherlands. He also directs the ALERT consortium, a collective of databases from several countries that stores the electronic medical records of approximately 30 million European residents, and that is used to study drug safety.
As soon as an adverse drug event is picked up with the help of the database, “one of the issues that needs to be addressed is biological plausibility — that is, can we find a biological explanation that relates this particular drug, or any of its metabolites, to this event,” van der Lei wrote in an email to BioInform.
That approach requires multi-disciplinary input from doctors, epidemiologists, and also, for example, toxicology and genomics researchers who all have their own journals, databases, and nomenclature.
“The infrastructures that enable much of the research in life sciences are still very fragmented,” he wrote.
Work at the intersection of disciplines is facilitated when ways can be found to combine various sources of information in a transparent manner, said van der Lei. “I believe indeed that scientific progress will be at the intersections of disciplines. My ideal for ELIXIR would be that it moves us a bit closer to a research infrastructure that facilitates interchange between different disciplines.”
“In business terms, [the funding of ELIXIR] is money to create a business plan, to go out, speak to all of the funders and get them to buy into the venture, ultimately,” said Thornton. The goals of ELIXIR, she said, are “bread and butter and foundations” because the data resources “are the foundations on which biology in this century will be based.”