Temporary Data Engineer

Stanford University, School of Medicine - Biomedical Data Science Initiative (BDSI)
Job Location
Stanford, CA
Job Description

This is a 6-month temporary opportunity with the potential to convert to a full-time, continuing role.

This position is located in the Office of the Dean, Stanford Medicine, and supports one of the Dean’s key initiatives: Biomedical Data Science. The initiative brings the power of “Big Data” to medical research and patient care delivery. We are looking for someone who can create the data platform for data science-based medical research. Stanford Medicine has access to data on genetics, immunomics, molecular studies, clinical care delivery, and much more. All of these play a part in human health, whether at the individual level, the population level, or anywhere in between.

Working with the university network infrastructure architecture and policies, this engineer will design models for storing and accessing this diverse data, both structured (e.g., quantitative) and unstructured (e.g., textual, image), so that it can be efficiently traversed, queried and managed. With the help of system administrators, s/he will bring up the data stores, optimize them, and build any data transfers via ETL or equivalent. Beyond that, this role will create or install an ever-richer set of data analysis and modeling tools suited to the needs of medical research. Because the goal is to make diverse, large-scale data available to researchers smoothly and quickly, this engineer will curate and optimize access to this medical data.

We are looking for someone who can work comfortably in a faculty-driven environment, with diverse research interests and timetables. S/he would help convert research-directed data requests into real data access recommendations. We require someone who takes ownership of the problem, works flexibly and well with others, and enjoys an academic, somewhat unstructured environment.


Required Qualifications:

- An undergraduate degree in Computer Science or equivalent, and at least 5 years of developing and maintaining data systems in C, C++ or Java.
- Deep knowledge of ETL: proprietary or open source.
- A strong understanding of both relational and document-based storage systems.
- Proficiency with data modeling, queries and data access, both via SQL and otherwise.
- Expertise in designing, developing, testing and deploying data applications.
- Familiarity with analytic tools: commercial (such as MicroStrategy, SAS) or open source (such as BIRT, R).
- Preference for, or willingness to try, open source software over proprietary, commercial software.
- Ability to define and solve logical problems for highly technical applications.
- A deep interest in applying emergent technologies to new fields.

Desired Qualifications: An advanced degree in Computer Science or equivalent, and experience in the following:
- Both structured and unstructured data at a large scale
- Python programming
- Distributed computing
- Hadoop environment and MapReduce programming
- The MySQL family of relational databases
- Database administration
- Statistical computing using R
- Working in an interdisciplinary environment

Stanford University is an equal employment opportunity and affirmative action employer and is committed to recruiting and hiring qualified women, minorities, protected veterans and individuals with disabilities.

The successful candidate must already have authorization to work in the United States; we are unable to provide sponsorship at this time.

How to Apply

Interested candidates may submit their cover letter and resume directly to Jan Dong at janinad@stanford.edu.

About Our Organization

About the Biomedical Data Science Initiative at Stanford:

From population health to personalized medicine, from modeling atoms to modeling the atmosphere, we are interested in using large-scale computing and data analysis to improve human health. We bring Stanford's legendary technology innovation to the frontier of biomedicine. Whether we are detailing the human immune system, finding new applications for drugs, deciphering autism in children, or monitoring pandemic strains with the help of social networks, we are shaping human health for the 21st century.

For more information, please visit: http://med.stanford.edu/bdsi/index.html
