ENCODE Effort to Use DNAnexus Data-Sharing, Analysis Platform

NEW YORK (GenomeWeb) – DNAnexus said today that the Data Coordination Center for the National Institutes of Health-funded Encyclopedia of DNA Elements (ENCODE) project has selected its cloud-based bioinformatics platform to handle data analysis and sharing for the third phase of the effort.

The DCC, which is at Stanford University, serves as the central hub for handling and processing raw sequencing data collected from the 14 biomedical institutes across North America that are involved in the ENCODE project. The project aims to comprehensively catalog all the features of the human genome and provide a foundation for studying the genomic basis of human biology and disease.

Researchers in the DCC chose DNAnexus' platform because it supports collaboration and provides a scalable environment for processing thousands of datasets, the company said. The system was also selected because it supports transparency, reproducibility, and provenance for ENCODE pipelines, which will help ensure clear and consistent results.

The researchers have already implemented and optimized ENCODE's bioinformatics pipelines to run on the cloud and have begun using them to analyze data from the project, according to DNAnexus. The analysis is expected to require 10 million core-hours of compute and to generate nearly 1 petabyte of raw data on the DNAnexus platform over the next 18 months.

The partnership will provide "secure and immediate access and use of ENCODE's results," DNAnexus CEO Richard Daly said in a statement. "We believe the availability of the consortium's gold-standard analysis pipelines and ENCODE data on a single integrated platform will accelerate genomic medicine."

The ENCODE pipelines are available in a public project on the DNAnexus platform and in a GitHub repository.
