Data Sharing and Consent


When human embryonic stem cell lines began to be reviewed last year to determine their eligibility for use by federally funded investigators, the National Institutes of Health considered the informed consent obtained from donors. The agency decided "to honor any restrictive language in the informed consent" and, thus, some of the newly approved lines may only be used to study pancreatic formation and diabetes.

As more and more genomic data is deposited into databases or is otherwise shared between researchers, similar ethical concerns are coming up in the genomics field. While sharing data is generally positive — particularly as large sample sizes are needed for many studies, such as genome-wide association studies — Amy McGuire, an associate professor at Baylor College of Medicine's Center for Medical Ethics and Health Policy, says that ethical issues can arise. "The primary concern, I think, is that you have appropriate informed consent from the research participants for their data to be shared. That becomes challenging when you have existing samples or data sets collected without the anticipation of data sharing," she says.

Naturally, researchers won't be replicating previous work exactly, so their particular secondary use of the data may not be expressly mentioned in the original informed consent. Researchers could go back and re-consent the original research participants for their secondary use — and McGuire says there are a few instances in which this has occurred — but that's difficult to do. There are a few options, then, to try to maintain the participants' privacy. One is to remove all personal identifiers from the data in the database so that no one could tell the source of the data. But not having that personal information, which could include phenotypic or clinical information, limits the research that can be performed down the line. Another option is to code the data so that only the original PI has access to personal information. "On the one hand, that does some to protect privacy. On the other hand, there are concerns because DNA is itself a unique identifier and so just doing that doesn't guarantee privacy," McGuire says. "There are still risks associated with it, albeit small and uncertain risks. … Some may argue that you still need informed consent so that people are aware of those risks."
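The second option, coding the data, can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not a method described by McGuire: the field names, the `IDENTIFIER_FIELDS` list, and the `code_records` helper are all hypothetical. Each record is split into a shareable row carrying a random code and a private key table that would stay with the original PI.

```python
import secrets

# Hypothetical identifier fields; a real study would define its own list.
IDENTIFIER_FIELDS = {"name", "date_of_birth", "address"}


def code_records(records):
    """Split records into shareable coded rows and a private key table."""
    coded_rows = []
    key_table = {}
    for record in records:
        # Random code that is not derived from the identifiers themselves.
        code = secrets.token_hex(8)
        # Personal identifiers go into the key table kept by the original PI.
        key_table[code] = {
            field: value
            for field, value in record.items()
            if field in IDENTIFIER_FIELDS
        }
        # Everything else is shared, tagged only with the code.
        shared = {
            field: value
            for field, value in record.items()
            if field not in IDENTIFIER_FIELDS
        }
        shared["participant_code"] = code
        coded_rows.append(shared)
    return coded_rows, key_table


# Example with made-up data.
records = [
    {"name": "Jane Doe", "date_of_birth": "1970-01-01",
     "genotype": "AG", "diagnosis": "type 2 diabetes"},
]
coded_rows, key_table = code_records(records)
```

Only `coded_rows` would be deposited in a shared database. As McGuire points out, the genetic data left in the shared rows can itself act as an identifier, so a step like this reduces the privacy risk but does not remove it.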

If you are planning to submit your data to a database, you can take preventive steps to try to avoid these complications. First, McGuire says, check that your informed consent is in order; be sure it doesn't contain language that could limit data sharing, as many older consent forms do. Then, think about what you do want to include to let people know what will happen to their data.

Another approach to adopt from the get-go is to collaborate with your study participants. In this model, researchers and participants have more of a partnership and stay in contact with one another over the Internet. These participants, McGuire says, can then continually update who may access their data and receive updates about what is being done with their information. "I think those are newer, more creative possible solutions that people are beginning to explore," she says.

On the downstream users' end, there are a few steps that can be taken to ensure data privacy and security. "Downstream users of stored data become fairly disconnected from the fact that this information came from a human being and gives you information about the potential health and other characteristics of a particular person," McGuire says. Just keep those people in mind, she says, by limiting access to their data and keeping it safe on a computer that can't be lost.
