We are seeking a Data Scientist with deep expertise in oncology who will combine knowledge of cancer-related data with technology, informatics, big data and analytics skills to help assemble a data platform that provides insights into the vast amounts of real-world cancer data.
You will be responsible for all activities necessary to access, mine and categorize structured and unstructured cancer-related data and to present that data in the Company’s big data platform. You will be engaged in all data-related activities, including data acquisition, design and implementation of data processing tools, data modeling, data architecture, data mining and statistical analysis, to ensure the Company’s products access, organize and deliver accurate and meaningful information to researchers, clinicians and patients.
You must be comfortable digging into databases, health records, data architecture and data processing tools alongside software engineering colleagues to ensure that data is correctly captured, managed and delivered. You must be highly motivated and comfortable in a fast-paced entrepreneurial environment, with experience designing and developing high-dimensional, data-intensive products, strong product leadership and teamwork skills, and an aptitude for new technologies.
- Align the Company’s business and product strategies with the required underlying data assets
- Support the development and expansion of the Company’s oncology data assets to promote research, clinical and consumer use
- Lead the definition, development, implementation and standardization of cancer data-related assets
- Provide expertise to fetch, process, cleanse, verify and QA raw data from various sources
- Design data collection processes, data quality programs and analytic tools to optimize creation and delivery of clinical oncology content
- Lead efforts to update, standardize and centralize disease factors such as cancer subtype, stage, therapy, and diagnostic factors
- Select features and build and optimize classifiers using machine learning techniques or big data tools
- Perform data mining, statistical analysis and visualization using state-of-the-art methods
- Create automated anomaly detection systems and continuous performance tracking
- Collaborate with customers, product stakeholders and engineering to gather and document data-related requirements for accessing, cleaning, categorizing, organizing and mining data
- Contribute expertise in medical informatics to support the utilization of the CIDT in research and clinical settings
- Assist with the development and maintenance of, and adherence to, policies, SOPs and data management plans related to data acquisition, use, security and compliance