Data Scientist - Bioinformatics

Lawrence Berkeley National Laboratory
Job Location
Walnut Creek, CA 94598
Job Description

Data Scientist-Bioinformatics - 81970
Organization:  EB-Environ Genomics & Systems Bio

Berkeley Lab is Bringing Science Solutions to the World, and YOU can be a part of it!

In the world of science, Lawrence Berkeley National Laboratory (Berkeley Lab) is synonymous with "excellence." That's why we hire the best - whether in research, finance or other operations. This is a great opportunity to bring your top-notch skills to bear in support of world-class scientific research that addresses national and global challenges!

Position Summary:
Within Berkeley Lab’s Environmental Genomics & Systems Biology Division, the Monarch Initiative is seeking a passionate and motivated individual to join its cross-functional and internationally collaborative project. We develop and apply cutting-edge approaches for analyzing cross-species genotype-phenotype data to enable biomedical and disease discovery in an open-source framework. We are looking for someone to help us build the world’s largest phenotypic knowledge base, and to develop innovative ways to access and use this knowledge.

Specific Responsibilities:
Software Developer 3
• Integrate new data sets of interest into our graph-based knowledge store
• Probe the data with interesting biological questions to test the correctness of the data model
• Implement new visualizations over the data
• Wrangle large datasets, including collaborating on innovative methods for storing and processing data
• Develop new solutions to intelligently automate the accumulation of phenotypic knowledge and datasets
• Implement substantial, known or novel, computational methods to improve the analysis capabilities and output of the group
• Troubleshoot systems and data problems and select appropriate solutions
• Participate in formal and informal design and code reviews
• Present progress at internal group meetings

In addition to the above responsibilities, the Software Developer 4 will:
• Identify, model, and integrate new data sets of interest
• Discover novel biological insights
• Design and implement new visualizations over the data
• Troubleshoot diverse and challenging systems and data problems and exercise independent judgment to select appropriate solutions
• Present progress at research conferences


Required Qualifications:
Software Developer 3
• Bachelor’s degree or equivalent plus 4 years of related experience in computer science, genetics, bioinformatics or related field
• Experience in wrangling and using biomedical data
• Ability to determine methods and procedures for new or existing software solutions
• Quantitative training with basic understanding of probability and statistics
• Ability to be self-driven and work well in a cross-functional and interdisciplinary team
• Proficiency in Unix environments
• Programming experience in:
   - Scripting languages such as Python
   - Analysis using statistical and machine learning tools, such as R, Weka, and scikit-learn
   - Experience with various databases and their query languages, such as SQL, SPARQL or Neo4j/Cypher
• Experience with communicating results and concepts to a diverse audience including geneticists, engineers, and laboratory scientists
• Ability to travel once per year
• Attention to detail

In addition to the above qualifications, the Software Developer 4 will have:
• MS or PhD with equivalent experience in computer science, genetics, bioinformatics, or related field
• Familiarity with development or management of pipelines in production environments
• Experience with semantic web, ontologies, their application, and analysis
• Experience with high performance computing environments and/or cloud-based environments
• Experience with genome and/or phenome-wide association studies or human population genetics / statistical genetics including imputation and haplotyping
• Proficiency with Java or JVM languages such as Scala or Groovy
• Ability to represent the Laboratory at conferences 2-3 times per year

This is a fantastic opportunity to help Monarch grow its knowledge store to new levels. If you like to poke at data and discover how animal models can translate into biomedical insight, then this is the ideal opportunity for you.

The posting shall remain open until the position is filled.

Notes: This is a 1-year term appointment with the possibility of renewal and with the possibility of conversion to career. Classification will depend upon the applicant's level of skills, knowledge, and abilities.

This position requires completion of a background check.

How to Apply

Apply directly online and follow the online instructions to complete the application process.

About Our Organization

Berkeley Lab addresses the world’s most urgent scientific challenges by advancing sustainable energy, protecting human health, creating new materials, and revealing the origin and fate of the universe. Founded in 1931, Berkeley Lab’s scientific expertise has been recognized with 13 Nobel prizes. The University of California manages Berkeley Lab for the U.S. Department of Energy’s Office of Science.

Equal Employment Opportunity: Berkeley Lab is an Equal Opportunity/Affirmative Action Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, age, or protected veteran status. Berkeley Lab is in compliance with the Pay Transparency Nondiscrimination Provision under 41 CFR 60-1.4. See the poster and supplement: "Equal Employment Opportunity is the Law."

