NEW YORK (GenomeWeb News) – Following a full year of community consultation, the National Institutes of Health last week issued its final policy on sharing data obtained from NIH-funded genome-wide association studies.
The policy is designed to facilitate the research community's access to genotype-phenotype datasets from GWAS while ensuring the privacy of research participants, NIH said in a notice released on Friday.
“The potential for public benefit to be achieved through sharing GWAS data is significant. However, genotype and phenotype information generated about individuals, such as data related to the presence or risk of developing particular diseases or conditions and information regarding paternity or ancestry, may be sensitive,” NIH said in the notice. “Therefore, protecting the privacy of the research participants and the confidentiality of their data is critically important. Risks to individuals, groups, or communities should be balanced carefully with potential benefits of the knowledge to be gained through GWAS.”
Among other provisions, the policy recommends that researchers deposit GWAS datasets in the Database of Genotypes and Phenotypes, or dbGaP, a repository at the National Center for Biotechnology Information that includes “multiple tiers” of data security to ensure privacy.
“Although the NIH envisions that access to all NIH-supported GWAS datasets will be possible through this repository, it does not intend the repository to become the exclusive point of data submission for these data, nor does it intend the central database to delimit the structures or tools that may be appropriate for other similar databases,” NIH said in the notice.
Investigators are also expected to submit descriptive information about their studies for inclusion in an open access portion of the data repository. All data submitted to the data repository will be de-identified and coded using a random, unique code, NIH said.
NIH said it also “strongly encourages” the submission of curated and coded phenotype, exposure, genotype, and pedigree data, as appropriate, to dbGaP “as soon as quality control procedures have been completed at the local institution.”
NIH said that although the “basic descriptive and aggregate summary information” submitted to the data repository will be publicly available, an NIH Data Access Committee, or DAC, will control and monitor access to the genotype and phenotype datasets and to automated calculations such as genotype-phenotype associations.
Investigators and institutions will need to submit a data access request to the DAC to access GWAS datasets. They must also promise to use the data only for the approved research, follow appropriate data security protections, and not attempt to identify individual participants, among other requirements.
NIH initially announced its plans to update its data-sharing policies for GWAS data last May and issued a formal call for public comments on a proposed policy in August. Following the comment period, the agency hosted a meeting in December to address questions related to the proposed policy.
The final policy was developed “in response to the feedback received and further internal development of the issues,” NIH said in Friday’s notice.
Further information about the data-sharing policy is available on NHGRI’s website and on its Office of Population Genomics page.