In a move that imperils a bioinformatics staple — and raises broader questions about the long-term stability of open source informatics resources — the UK’s Medical Research Council has announced it will shut down the bioinformatics division of the Rosalind Franklin Center for Genomics Research (RFCGR), which currently hosts the EMBOSS software package.
In late March, MRC decided to close the research division and the bioinformatics division of the RFCGR (formerly called the Human Genome Mapping Project Resource Center) by July 31, 2005. The decision, which was not made public until the EMBOSS project posted a statement on its homepage a few weeks ago, affects more than 20 people in the bioinformatics group, and around 60 people in the research division. The MRC GeneService division at RFCGR, which provides reagents along with DNA and RNA services, will remain open.
The RFCGR’s bioinformatics division is best known internationally for maintaining EMBOSS (European Molecular Biology Open Software Suite) — a package of more than 100 sequence analysis applications that offers an open-source alternative to Accelrys’s GCG Wisconsin Package and serves as the backbone of numerous bioinformatics pipelines worldwide. EMBOSS does not receive any direct funding, but the RFCGR developers have maintained the package “as part of their wider remit of providing a national bioinformatics service for the UK,” said Alan Bleasby, EMBOSS coordinator and head of the proteomics applications group at the RFCGR. The dissolution of the RFCGR bioinformatics group, therefore, will effectively halve EMBOSS’s core development team.
Describing the potential impact on the project, Bleasby struggled to find a strong enough word: “I shall not use ‘decimate,’” he said, “because that word only means to reduce by one-tenth. … but it does affect development drastically.”
Rise and Fall?
MRC’s verdict adds a new wrinkle to the intertwined history of GCG and EMBOSS, which has typified the relationship between academic and commercial software development within the bioinformatics community. EMBOSS traces its roots to 1989 and a project called EGCG, or “Extended GCG.” EGCG was based on Genetics Computer Group’s GCG source code, which was publicly available at that time, and in a friendly arrangement, GCG agreed to distribute the EGCG extensions along with its own software to customers.
But when the company decided to charge for access to the GCG source code as it prepared for an acquisition by Oxford Molecular in 1997, EGCG ran into limitations regarding redistribution of the software. So Peter Rice (then at the Sanger Center), Bleasby (then at Daresbury Laboratory), and some of the original EGCG developers began to write EMBOSS from scratch; the object was to develop an open-source set of libraries and applications, initially incorporating EGCG functionality, for genome-scale analysis.
Since then, a core team of developers has supported the project despite limited funding. The Wellcome Trust funded a postdoc position in 1997, and more recently the UK’s Biotechnology and Biological Sciences Research Council (BBSRC) funded two postdoc positions for EMBOSS development as part of the broader Collaborative Computational Project 11. Even so, Rice (now at EBI), Bleasby, and the other core developers have borne the brunt of the maintenance tasks on a volunteer basis.
Version 1.0 of EMBOSS was released in July of 2000 — just as Pharmacopeia (now Accelrys) was acquiring GCG via Oxford Molecular — and since then, the package has been downloaded to more than 21,000 unique Internet addresses, Bleasby said. In addition to the core developers, the package is updated frequently by a growing network of volunteers. “We get submissions all the time from far and wide,” he said.
For many in the bioinformatics community, EMBOSS has served as a poster child for bioinformatics open source development: With scant funding and a dedicated team of developers, the project has evolved into an essential component of the bioinformatics toolkit for many organizations. But the threat posed by the closure of the RFCGR group highlights an often-overlooked aspect of such projects: Even distributed, volunteer-based projects require a home base — and a stable core development team — to survive.
“This is coming at a particularly irritating point, because EMBOSS has been going from strength to strength,” Bleasby said. Projects in the pipeline include a new indexing system, interactive graphics, an EMBOSS ontology, and the provision of EMBOSS web services and workflows as part of the myGrid and Taverna projects. The closure of the RFCGR group will affect several of these directly and the others indirectly, Bleasby said.
The MRC’s decision to eliminate the bioinformatics group has already begun to take its toll on the EMBOSS team, Bleasby added. One of the project’s developers has taken a post at the Sanger Institute, where he has been granted permission to continue to maintain the Jemboss GUI. “It’s not an ideal solution, but it’s as good as we could hope for,” Bleasby said. “Heck, in this position, if somebody offers you a job, you’re not going to refuse.”
A Year to Plan
As the RFCGR team is dissolved, it’s likely that other developers will be able to support EMBOSS under similar arrangements, but Bleasby said that he and his colleagues are exploring options that would keep as many of the core developers together as possible. One possibility, he said, would be for the MRC, BBSRC, or industry to fund several “virtual desks” at EBI or another research center for developers to work on the project full-time.
Bleasby said he’s also collecting letters of support from EMBOSS users, and hasn’t lost hope that MRC and BBSRC will “have a revelation that their institutes and their scientists do use the software.”
While there is a full year to plan for the closure of the RFCGR group, “a year is not a long time in trying to arrange this sort of thing, so a start has to be made now,” Bleasby said. “You can imagine as people are finding jobs, they will be leaving.”
The elimination of the RFCGR bioinformatics group follows the MRC’s decision last year to cut funding for the pioneering Staden package [BioInform 05-05-03], raising questions about the future of bioinformatics funding in the UK.
In an official statement released May 14, MRC said, “Competition for MRC funding is extremely high and, overall, the Center’s future plans were deemed uncompetitive.” MRC said it would “actively consider” which of the center’s services and resources are “essential,” and which scientific programs of work might be funded elsewhere.
“The center has made an important contribution to medical science with its role in the Human Genome Project and it’s never an easy decision to close a research unit,” said Colin Blakemore, MRC CEO, in the statement. “But science is continually evolving and we must ensure that our limited resources are used to focus on delivering the very best science in a manner which reflects best value for money.”
The MRC did not respond to BioInform’s requests for comment on the council’s future bioinformatics funding outlook.
BBSRC, meanwhile, “has no plans to reduce its commitment to bioinformatics research,” according to a council spokesman, and “is keen to expand its funding.” Andrew McLaughlin, media officer at BBSRC, told BioInform via e-mail that the council has not ruled out direct funding of EMBOSS, “but any application will be fully peer reviewed and will have to successfully compete for BBSRC funds against all other grant applications if it is to be supported.”