By Julia Karow
This article was originally published Oct. 21.
The National Institutes of Health plans to update its data-sharing policies for sequence and related genomic data obtained with "advanced sequencing technology," and to encourage investigators and institutional review boards "to consider the potential for broad sharing of sequence and related genomic data" when developing informed consent protocols for studies that involve human sequence data, according to a recent notice from the National Human Genome Research Institute.
According to the document, "very large" sequence data sets that allow for new biological insights "are not only valuable for addressing the questions that the experiments were designed to ask, but also have added scientific value when combined with other large data sets."
Thus, the NIH has concluded that "the full value of sequence-based genomic data can best be realized by making the sequence, as well as other genomic and phenotype datasets derived from large-scale studies, available as broadly as possible to a wide range of scientific investigators," for example by depositing them in centralized databases such as the GenBank Short Read Archive or the Database of Genotypes and Phenotypes.
NIH "considers broad data access to be particularly important for sequence and related genomic data because of the significant resources involved in generating such data … the analytical and computational challenges involved in interpreting such large datasets, and the powerful opportunities that will be provided by the ability to make comparisons across multiple studies."
As many sequencing studies to be conducted over the next few years will include human clinical and phenotype information, "safeguarding the interests of research participants will be an essential component of any sequence data sharing policy," according to the notice, and NIH recognizes that sharing data from studies with human participants "must be consistent with the informed consent provided by the individual participants."
NIH said it expects to build its new sequence data-sharing policy on previously developed policies in areas with similar issues, in particular the NIH genome-wide association studies data-sharing policy.
Issues currently under consideration for the new sequence data policy are: criteria for determining whether a project falls under the new policy, including data utility, uniqueness, quality, and participant protection; types of data and annotations that should be released; timing of data release; mechanisms for making data available to third parties; and costs of implementing policies.
NIH said it anticipates the policy will be developed over the next several months. "At an appropriate time before the policy is implemented, the NIH will publish additional details on the policy plans," according to the notice.