NEW YORK (GenomeWeb) — The National Human Genome Research Institute has earmarked nearly $7.3 million to fund a series of research projects over the next several years to advance its recently established Human Genome Reference Program (HGRP), which aims to build a human genome reference that better accounts for genetic diversity than the existing reference.
In early 2018, the NHGRI held a meeting of more than 65 basic research, clinical, and bioinformatics scientists, who determined that the current version of the human reference does not adequately represent human haplotype variation, that the existing tools to include alternative haplotype information in analyses are not well used, and that there is an opportunity to significantly improve the human reference by developing it into a pan-genome. As a result, the agency is launching the HGRP to enable an improved human genome reference, and to foster its long-term sustainability and improvement.
To that end, the NHGRI has set aside $2.5 million in fiscal 2019 for a Human Genome Reference Center (HGRC) to oversee the activities of the HGRP, acting as its logistical and scientific coordinating center. The HGRC will also be responsible for constructing and releasing new human reference sequence versions; receiving and resolving error reports; integrating and coordinating with the HGRP's various arms; fostering collaboration with external investigators; and providing community outreach and training for using the reference.
The NHGRI has also committed $3.5 million in fiscal 2019 to fund the production of roughly 350 high-quality — and ideally haplotype-resolved — human genomes to aid in the development of a genome reference that is representative of human population genetic diversity. This effort is expected to initially use existing samples that have sufficient diversity and have been consented for full data release.
The NHGRI said it expects high-quality genome assemblies to be produced at the levels needed for the human reference, balancing the available funds against the use of the best available technologies, but noted that the appropriate metrics and ways to add diversity are still partly open scientific questions. "We anticipate that program metrics and priorities will be established through discussion within the HGRP as a whole before significant data production begins," it said. "Moreover, sequencing technologies, cost, required quality, and added value of new genomes will be assessed by the program over the course of the project, as these considerations are expected to change over time."
Lastly, the NHGRI will provide about $1.3 million in fiscal 2020 to fund two to four projects focused on developing improved representations for computing on the information contained within the diverse genome assemblies that will make up the human reference sequence.
Funded projects are expected to develop software that can demonstrate efficiency, scalability, computational speed, ease of use, and the ability to foster adoption of the reference and analysis tools developed by other HGRP contributors, the NHGRI said. Such software is also expected to adhere to a high level of open science to enable the pan-genome reference to be widely integrated in the larger community and to support outside contributions.