What we do - the gist
At Recursion, we can "ask the cells for more data" through frequent, high-throughput, information-rich biological experiments, to an extent that sets us apart from most others doing biology and disease research.
We grow human cells, make them into models of thousands of rare diseases by breaking the genes corresponding to each disease, take pictures of them using automated microscopes, computationally extract 1000 structural features like shapes and textures from every cell, and quantify the structural differences that separate diseased from healthy cells. We then apply thousands of drugs to the cells corresponding to each disease, take pictures, and identify drugs that make the cells look healthy again. These drugs are investigated by our biologists, tested in animals, and can eventually become new treatments for any of the thousands of untreated genetic diseases.
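For the curious, here is a toy sketch of the idea in the paragraph above. It is not our actual pipeline: the data is synthetic, and the function names (`disease_axis`, `rescue_score`), the cell counts, and the mean-difference scoring are assumptions chosen purely for illustration. It treats each cell as a row of 1000 features, finds the direction that separates diseased from healthy cells, and scores a drug by how far it moves diseased cells back along that direction.

```python
import numpy as np


def disease_axis(healthy, diseased):
    """Unit vector pointing from the healthy mean profile to the diseased one."""
    delta = diseased.mean(axis=0) - healthy.mean(axis=0)
    return delta / np.linalg.norm(delta)


def rescue_score(healthy, diseased, treated):
    """Fraction of the healthy-to-diseased shift that a treatment reverses.

    Near 0: treated cells still look diseased. Near 1: they look healthy again.
    """
    axis = disease_axis(healthy, diseased)
    h = healthy.mean(axis=0) @ axis
    d = diseased.mean(axis=0) @ axis
    t = treated.mean(axis=0) @ axis
    return (d - t) / (d - h)


# Synthetic stand-in data (hypothetical): rows are cells, columns are features.
rng = np.random.default_rng(0)
n_cells, n_features = 500, 1000
shift = np.zeros(n_features)
shift[:50] = 1.0  # pretend the disease perturbs the first 50 features

healthy = rng.normal(size=(n_cells, n_features))
diseased = rng.normal(size=(n_cells, n_features)) + shift
inert_drug = rng.normal(size=(n_cells, n_features)) + shift       # no effect
good_drug = rng.normal(size=(n_cells, n_features)) + 0.1 * shift  # mostly rescued

scores = {
    "inert_drug": rescue_score(healthy, diseased, inert_drug),
    "good_drug": rescue_score(healthy, diseased, good_drug),
}
for name, score in scores.items():
    print(f"{name}: {score:.2f}")
```

In this toy setup the drug that removes most of the synthetic shift scores much higher than the inert one. Real data is far messier (batch effects, correlated features, toxicity), which is exactly where the interesting problems are.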
What you'll do
You'll work with our data, biology, and engineering teams to identify and answer questions in high-dimensional data using your abilities and intuitions and our evolving data science platform. This platform is the core of our mission -- transforming drug discovery into a data science problem. We're tackling challenging problems, often with no obvious solutions, and in some cases with no right answers. But we're a group of sharp and highly motivated scientists and engineers with diverse backgrounds, and we're making rapid progress.
The high-level job description has only one item: do whatever is necessary to help us progress in identifying cures for diseases. We hire the best, and trust that they are usually in the best position to decide what to try next.
Typical work includes:
- Work directly with our biologists to understand the questions they need answered to help continue driving our platform forward.
- Perform exploratory analysis and build creative visualizations of our high-dimensional numerical data.
- Work with weekly experimental datasets on the order of 10 million rows (one per imaged human cell) by 1000 features.
- Perform analysis, learning and classification, using existing tools and building your own when appropriate, both on your own and in collaboration with other team members.
- Develop mature analyses into web-based tools usable by our biologists.
- Assist in the experimental design and planning process as your familiarity with our data and platform grows.
- Present your work and pick up techniques at conferences, as desired.
What you need
- High fluency, with skills equivalent to 1+ years of experience, in statistics, machine learning, coding, and answering questions in high-dimensional numerical datasets, preferably using the Python data stack (pandas, sklearn, etc.).
- Thorough grasp of fundamentals of machine learning such as cross-validation and learning curves.
- A track record of outstanding past projects, publications, or presentations.
- Code you can share. Send a link along with your email (helpful, not required).
- Biology background is _not_ necessary. Intellectual curiosity and motivation to learn are a must, though!
There are more than 5,000 untreated rare genetic diseases, which together affect nearly ten million people in the US alone. Each of these diseases affects too few people for traditional pharmaceutical companies to pursue, so we're building a way to seek treatments for hundreds of these diseases in parallel. We aim to find treatments for 100 of them in the next 10 years.
We offer competitive compensation, health insurance, an outstanding team, challenging and worthwhile problems, and close proximity to some of the best skiing, hiking and climbing in the world.