NEW YORK (GenomeWeb News) – The Human Genome Sequencing Center at Baylor College of Medicine and DNAnexus today announced a collaboration aimed at the large-scale analysis of genomic data.
HGSC has adopted the DNAnexus enterprise cloud platform to power its Mercury pipeline. HGSC and DNAnexus worked with Amazon Web Services to process data from the Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) Consortium using that pipeline.
As a result, 430 terabytes of data were generated and made available to the more than 300 researchers involved in CHARGE.
The project spans five institutions around the globe and involves the analysis of genome sequencing data from more than 14,000 individuals, comprising 3,751 whole genomes and 10,771 exomes. CHARGE requires about 2.4 million core-hours of computational time and 860 terabytes of storage.
At the project's peak, HGSC said it used the DNAnexus platform "to spin up more than 20,000 cores on demand" so that the CHARGE data could be run through the Mercury analysis pipeline.
"The management and analysis of genomes at the scale needed to appropriately power clinical studies requires computational infrastructure that exceeds the capacity of most institutional resources," Jeffrey Reid, assistant professor in the Department of Molecular and Human Genetics at Baylor College of Medicine, said in a statement. "Working with DNAnexus and Amazon Web Services, we were able to rapidly deploy a cloud-based solution that allows us to scale up our support to researchers at the HGSC, and make our Mercury pipeline analysis data accessible to the CHARGE Consortium, enabling what will be the largest genomic analysis project to have ever taken place in the cloud."
"Many large-scale population studies to date have been limited in scope by a lack of the necessary compute power; this is a real hindrance in realizing the full promise of genomic medicine," DNAnexus CEO Richard Daly added. "Through this collaboration with the HGSC and Amazon Web Services, 300 scientists can now perform downstream analyses on these invaluable health and aging data at a scale not previously possible."
Financial and other terms of the deal were not disclosed.