NEW YORK (GenomeWeb) – In an open-access pilot project, researchers from Baylor College of Medicine have demonstrated that cancer patients are willing and able to provide "true informed consent" for sharing their genome sequencing data.
They hope that open datasets of "real world" cases will be useful for advancing precision cancer treatment and spur public discussion about protecting patients' privacy while respecting their autonomy to share data freely to advance science.
According to researchers led by Baylor's Lauren Becnel and Richard Gibbs, the pilot project, in which seven patients agreed to openly share data from sequencing their tumor and matched normal samples, is the first of its kind in the cancer setting. These patients' genomic information — exome sequencing data from seven patients and additional whole-genome sequencing data from two patients — is available freely for research through the Texas Cancer Research Biobank (TCRB), as long as users agree to a few broad tenets, mainly that they will not try to re-identify study subjects.
"We tried in this project to respect participants’ right to privacy, while allowing them to make an informed choice about taking reasonable risks to their privacy in order to help advance research," Amy McGuire, director of Baylor's Center for Medical Ethics and Health Policy, told GenomeWeb.
Genomic research carries with it the potential for privacy breaches and re-identification of participants, which, in turn, could put them at risk for genetic discrimination. As such, researchers can access individual-level genomic data through resources like dbGaP and cgHub only if they agree to strict security and data-use agreements. This reduces the risk of re-identification and discrimination for participants, but it's not foolproof. Meanwhile, restricted access to these datasets inhibits research, places a wall between patients who want to share their data and genomic researchers, and hinders use of cloud-based platforms with appropriate data control, Becnel and colleagues wrote in a Scientific Data paper describing the pilot project.
"Large-scale cancer sequencing efforts have generally been limited to specimens that meet stringent criteria," Becnel told GenomeWeb over e-mail. "These criteria help ensure the highest possible quality genomic data, but exclude many participants who had received prior treatment and as a consequence did not have high cellularity from research."
In the present pilot project, Becnel's group included samples from patients that they deemed to be representative of "real-world" cases, for example, patients who had received prior treatment and had low tumor cellularity. Such cases are needed so researchers can start to figure out how to perform genomic and bioinformatic analyses in patients who don't provide "ideal" specimens.
Researchers involved in the government-backed NCI-MATCH study, which is exploring a genomically guided precision medicine hypothesis in cancer, have cited tissue quality as one of their early challenges. Within the study, currently on hold for a planned interim analysis, not many patients have yet "matched" to treatment arms for a number of reasons, including the fact that around 15 percent of the samples were inadequate for analysis in terms of sample amount and quality.
Becnel believes that open datasets like the one her group published can train the next generation of genomics researchers and bioinformaticians. "We envision that these data will be utilized to develop and advance analytical and visualization tools," she said.
For example, the researchers wrote in the Scientific Data paper that bioinformaticians can use datasets from less-than-ideal samples to tune their callers to low tumor cellularity samples. "We hope that this pilot will present an opportunity for the community to address these 'real-world' cancers and ultimately translate findings that will provide benefit to a wider array of patients," Becnel added.
The pilot project also demonstrates that cancer patients "could and would provide true informed consent," the researchers wrote in their paper. They offered 194 cancer patients the chance to share their genomic data in an open-access research context. From the more than half who agreed, Becnel and colleagues chose 37 patients, to whom they provided additional education about the risks and benefits of data sharing and whom they surveyed to gauge comprehension, risk tolerance, and comfort with open-access release of their data. After this, 23 patients still agreed to share their genomic data.
The researchers didn't ask participants about their motivations for open-access data sharing, "but since all of the participants had cancer, I imagine they were interested in moving the research forward," McGuire observed.
Researchers then chose seven out of the 23 patients who demonstrated that they understood the risks and benefits of data sharing and had a high tolerance for risk. Research instructors and students who want to use the data can create an account, accept the TCRB's conditions of use, and download the information without submitting a formal data-access request to data-use committees.
Users have to agree not to try to identify the seven individuals who shared their data; not to compare or link this data to private health information; and not to use the data for "direct profit" or resell it. TCRB is making the data available open access for learning and research purposes. Users can publish their own research with this data, as long as they properly acknowledge TCRB and the Baylor College of Medicine Human Genome Sequencing Center.
Ultimately, the researchers decided to release the genomic data of only seven of the 23 cases in an effort to "protect confidentiality to the fullest possible extent," according to Becnel. For example, researchers made sure that none of the patients in the open access pilot project had rare ethnicities or tumor types defined by statistics from the Surveillance, Epidemiology, and End Results program.
McGuire, who was also an author of the Scientific Data paper, noted that there are fewer privacy risks for patients sharing tumor genomic data than their germline genomic sequence. Still, she acknowledged that some patients who have a disease might be concerned about their data privacy and the potential for discrimination. "On the other hand, they might also be more likely to appreciate the benefits of data sharing and advancing scientific research, making them more motivated to participate," McGuire said.
With this open-access pilot project, Becnel hopes to advance public discussion on the two ethical responsibilities that genomic research must increasingly grapple with: patients' right to privacy and their autonomy in choosing when and how they share their data. These ethical discussions will be critical to the success of high-profile projects, such as the Precision Medicine Initiative, within which the NIH will build a cohort of 1 million volunteers who will share their medical, environmental, and genomic data for research.
The White House recently announced that Vanderbilt University and Verily, a life sciences subsidiary of Alphabet, will conduct a pilot program exploring the best ways to engage, enroll, and retain participants in the PMI cohort. "We want to enable any person anywhere in the United States to be able to raise their hand and volunteer to participate," NIH Director Francis Collins said during a briefing last week. "This pilot approach will allow us to learn how to create durable relationships with volunteers."
Meanwhile, genomic testing options are popping up for the private individual who doesn't want to share this information. For wealthy customers who want to lock down their data, testing firm Guardiome provides whole-genome sequencing for around $3,200 and sends the data back on a device called Helixa. Inside this desktop device, a customer's WGS data is encrypted and accessible only by password. Any tampering or access without authorization triggers the data to self-destruct.
But even this type of a service wouldn't stop a truly motivated DNA burglar. Outside of a genomics and research context, people make tradeoffs all the time between how well their privacy is protected and the conveniences they benefit from, for example when using a credit card for online shopping, McGuire noted. "Those are decisions we make every day, and it is no different than the decision to take risks to your privacy by sharing your medical or genetic data in service to some potential social benefit," she said.
This article has been corrected to note that the paper published by Becnel et al. was in the journal Scientific Data, not Nature.