Text miner (Data Scientist)

Organization
EMBL-EBI
Job Location
EMBL-EBI, Wellcome Trust Genome Campus
Hinxton, Cambridge
CB10 1SD
United Kingdom
Job Description

We are seeking to recruit a senior text and data mining specialist to join the Literature Services Team at the European Bioinformatics Institute (EMBL-EBI) located on the Wellcome Trust Genome Campus near Cambridge in the UK.

Manual curation by expert biologists is a critical approach to developing quality public data resources in the life sciences. As the team at the EBI responsible for running Europe PMC (the database of research articles incorporating both PubMed and PMC; http://europepmc.org), our goal is to integrate the open research literature with public data resources, supporting better search technologies, browse experiences, and database curation workflows. Based on Europe PMC content, we: (1) enable the text mining community to extract named entities, relationships, or events from the content and highlight these on articles; (2) text mine content daily to extract entities such as genes/proteins, organism names, chemicals, Gene Ontology terms, diseases, and data citations; (3) participate in a variety of projects that support curator workflows through text and data mining.

Specific job responsibilities include:

  • Lead the development of new text mining services;
  • Improve existing core text mining services;
  • Analyse data and evaluate extraction results;
  • Iteratively improve solutions with key stakeholders;
  • Participate in project meetings and conferences to present work;
  • Produce publications on developments.

Requirements

The successful candidate must be able to demonstrate most of the following:

  • MSc or PhD in a related field (computer science, information science, bioinformatics);
  • Proven experience of a range of methodologies such as NLP, pattern mining, dictionary-based techniques, machine learning, deep learning and ranking systems;
  • Experience of text mining applied to curation of biological data resources, or similar, in an academic, industrial or publishing setting;
  • Technical ability, e.g. Perl, Java, R, XML parsing;
  • Flexible approach and ability to take on new skills;
  • Self-starter, able to manage multiple projects;
  • Team player and good communicator.

How to Apply

To apply, please submit a covering letter and CV, with two referees, through our online system.

About Our Organization

EMBL is an inclusive, equal opportunity employer offering attractive conditions and benefits appropriate to an international research organisation. The remuneration package comprises a competitive salary, a comprehensive pension scheme and health insurance, educational and other family related benefits where applicable, as well as financial support for relocation and installation.

We have an informal culture, an international working environment and excellent professional development opportunities, but one of the really amazing things about us is the concentration of technical and scientific expertise, something you probably won’t find anywhere else.

If you’ve ever visited the campus, you’ll have experienced first-hand our friendly, collegial and supportive atmosphere, set in the beautiful Cambridgeshire countryside. Our staff also enjoy excellent sports facilities including a gym, as well as a free shuttle bus, an on-site nursery, cafés, a restaurant and a library.
