NIH to Pump up to $96M into New Big Data Centers

NEW YORK (GenomeWeb News) – The National Institutes of Health said on Monday that it will spend up to $96 million over four years to fund several new centers that will be at the forefront of its effort to build up the nation's big data capabilities for biomedical research.

NIH said it plans to award up to $24 million per year for four years to fund between six and eight Big Data to Knowledge (BD2K) Centers of Excellence, which will seek to help the US research community handle and use the "increasingly large and complex data sets" that are pouring out of the biomedical science community.

This is the first round of funding to be announced under the BD2K program, with more to come in the coming months, NIH said. BD2K was unveiled in December and was followed by an information-seeking initiative from the National Human Genome Research Institute, which sought input from the research community about obstacles and opportunities related to big data.

"BD2K aims to enable a quantum leap in the ability of the biomedical research enterprise to maximize the value of the growing volume and complexity of biomedical data," NHGRI Director Eric Green said in a statement.

Green, who is also serving as acting director for data science, said these centers will be "a key component to the overall initiative."

The centers will have a strong interdisciplinary component, and will seek to bring in investigators who have experience with data science to work with biomedical researchers.

"The goal is to help researchers translate data into knowledge that will advance discoveries and improve health, while reducing costs and redundancy,” NIH Director Francis Collins said.

There are several major challenges that NIH wants these centers to address. One problem is that scientists need a resource they can turn to that will let them know what kinds of datasets and software tools are available, and where to find and how to use them.

Investigators also need to find straightforward means for releasing their metadata in standard formats, and ways to obtain and analyze data sets from others.

Standardization is particularly important, NIH said, because it enables interoperability, data sharing, and better management and analytical tools. The centers also will seek to develop better practices and policies for sharing data and software, and conduct research into new ways to organize, manage, process, and analyze large biomedical datasets.

Another core goal of the program is for the centers to train new investigators, helping them develop the skills to engage with quantitative fields such as computational biology, biomedical informatics, and biostatistics.

Investigators at the BD2K centers also will work to develop new software, tools, and methods for working with big data, NIH said.
