Cambridge, United Kingdom

We are seeking a software engineer to join the non-vertebrate genomics group at the European Bioinformatics Institute (EMBL-EBI), to work on the WormBase project. EMBL-EBI is located on the Wellcome Trust Genome Campus near Cambridge in the UK.

Parasitic worms (helminths) are responsible for more than a billion human infections globally and have a devastating impact on livestock and agriculture. EMBL-EBI is a member of the WormBase consortium, an international project to curate, store and display scientific data relating to the model helminth C. elegans and other worms. As international efforts to sequence the genomes of parasitic helminths accelerate, we are collaborating with the Parasite Genomics Group at the Wellcome Trust Sanger Institute on the provision of a new BBSRC-funded resource, WormBase-ParaSite, to analyse, store and present information on these genomes.

The primary role of the post-holder will be to develop (i) a data warehouse that is optimised for rapid yet flexible querying of large volumes of data, and (ii) a web portal with a querying interface that will allow users to access, summarise and download relevant portions of the total data set. The development of this resource will combine custom-written software with the use of standard technologies already deployed within EMBL-EBI, such as BioMart, the Ensembl genomic data management infrastructure, and the Drupal content management system.

The WormBase team at EMBL-EBI are part of the wider non-vertebrate genomics group, which produces the Ensembl Genomes resource and contributes significantly to various global collaborative projects such as VectorBase and PomBase. The post-holder will work closely with other members of the wider group and participate in the development of general tools for invertebrate genomics.

Candidates should have a post-graduate qualification in bioinformatics or a related discipline, and at least 2 years' professional programming experience. A solid working knowledge of at least one programming language (e.g. Perl, Java) and of relational databases is essential. Background knowledge of molecular biology and/or genomics is desirable, and experience of dealing with next-generation sequencing data would be an advantage.

