Genomics Data Scientist. Expert in Machine Learning and Deep Learning in 'omics'

Job Location
Remote, Telework
Blacksburg, VA 24060
United States
Job Description

Genomics Data Scientist, Expert in Natural Language Processing and ML. Virology computational pipeline project.

Number of Consultant Positions: 1

Location: Telework option

We are looking for a Data Scientist specializing in genomics to join our group. This is a consulting position, 1-15 hours per week to start, moving to more or fewer hours later as needed. Hours are flexible, but the candidate MUST be an expert in applying NLP/ML to genomics or proteomics data (NGS).

The position is part advisory to an existing team and part programming, as needed. It will provide the candidate(s) with an opportunity to apply their computational and mathematical skills to develop innovative methods for analyzing complex virology data to identify novel mutant strains, targets and biomarkers associated with pathogen-associated diseases.

Project Overview:

The successful candidate(s) will have researched and used many of the available NLP/ML tools on genomics data sets, and will be able to advise and contribute in a "hit-the-ground-running" capacity, helping the existing team brainstorm and troubleshoot the application of our novel ML designs to NGS data in our existing pipeline.

The development environment is in Linux, and geared toward large-capacity Linux servers.

Please only apply if you are a Genomics Data Scientist, an EXPERT in NLP and other ML methods applied to genomics (NGS, RNAseq, etc., or protein sequence) data!

Job Types: Part-time, Contract


General MANDATORY requirements for the position(s):

  • Languages: Python, SQL
  • Operating system: Linux
  • Big Data: Next generation sequencing data, genomics data (NGS, RNAseq, other). Proteomics experience + ML can be considered
  • Science field: Genomics-based; preferably viruses or polyploid genomes
  • Expert in: Natural Language Processing (NLP), Machine Learning, Deep Learning
  • Variational Autoencoding, or data decompression methods
How to Apply

Contact the company directly: [email protected] or [email protected] or [email protected]

May apply on the GATACA website:

About Our Organization

GATACA, LLC is a new-frontier biotech company, located in SW Virginia near Virginia Tech, that develops advanced bioinformatics tools for virologists to identify mutating variants of disease-causing viruses from sequence data. We are a nimble, dynamic team designing and developing algorithms and a pipeline to support the computational side of virology research and clinical investigations of pathogenic viruses. Working alongside a prestigious group of world-renowned scientists, we have tailored our pioneering bioinformatics pipeline for virologists. Once deployed, the pipeline will be adapted to other disease indications (e.g., cancer), applications (e.g., clinical trial screening), multi-omics data, functional genomics and structural bioinformatics to drive new target evaluation and clinical research.


