NEW YORK (GenomeWeb) – Researchers from McGill University and other institutions recently published a paper in the European Journal of Human Genetics discussing the legal and ethical issues that surround genomic data control; data security, confidentiality, and transfer; and accountability in the cloud. They also made a series of recommendations to guide members of the biomedical and clinical research community who are considering using cloud infrastructure in their projects.
The paper assesses the legal and ethical risks of entrusting data to these systems and suggests mitigating measures that researchers could take into account as they consider contracts with cloud service providers (CSPs). The recommendations are based on an assessment of the terms-of-service agreements from multiple major cloud service providers and are intended to provide "a useful starting point for researchers to consider when negotiating legal arrangements to store genomic data in the cloud," the researchers wrote. The researchers also provide at least one set of metrics for evaluating different cloud options en route to determining which platform best meets a particular institution's needs, Edward Dove, socio-legal scholar at McGill's Center for Genomics and Policy (CGP) and a co-author on the paper, told BioInform.
The paper contributes to ongoing discussions in the community about using the cloud to host and analyze the massive quantities of genomic and clinical data churned out by large-scale projects, such as the International Cancer Genome Consortium, as well as by individual institutions and centers that now use sequencing technologies in their projects. Generating sequence data is now much easier and cheaper than it used to be, but storing and analyzing the information quickly remains an expensive proposition with its own challenges.
"Simply buying more servers for a local research site is no longer an optimal or even feasible solution," the researchers note in the EJHG paper. As such, many in the community "are increasingly turning to cloud computing both as a solution to integrate data from genomics, systems biology, and biomedical data mining and as an approach to mine and analyze data to solve biomedical problems."
The paper lists the various cloud computing options available in the market but focuses exclusively on considerations for publicly available commercial cloud infrastructure from major vendors such as Amazon and Google. More specifically, the researchers focus on the terms-of-service agreements that accompany systems operating under an infrastructure-as-a-service model, in which providers offer raw computing resources and storage options. The recommendations they make are based on reviewing documents provided by Amazon, Google, and Microsoft in particular.
The measures listed in the paper address issues such as unauthorized access, infrastructure failure, data loss, and differing national regulations on data access and use. Dove, who is also coordinator of the Global Alliance for Genomics and Health's regulatory and ethics working group, said that he and colleagues at CGP are working on a simple checklist that they believe will help researchers navigate the legal and ethical issues associated with using the cloud. They plan to make it available to the community by next year.
For example, when it comes to issues associated with data control, Dove et al. recommend that researchers keep tabs on amendments that companies make to their terms-of-service documents "with a reasonable period of time [allowed] for response and acceptance." Researchers should also have a mechanism for retrieving their data from the cloud after a contract ends and ensure that cloud providers do not retain access to the data, they said. Moreover, institutions should have sound "data encryption capabilities and good management infrastructure for control" over data that they choose to store in the cloud.
Considerations for ensuring data security and confidentiality include verifying which data elements will be stored in the cloud, "including whether [these] constitute sensitive personal data or personal health information," and limiting access to researchers who have been vetted and approved by data access committees. Researchers should also make sure that selected CSPs have undergone independent audits against the appropriate security standards and that their audit certifications remain current for the duration of the contract. Other recommendations include being aware of the chain of responsibility for preserving data confidentiality and integrity, and requesting "full indemnity for liability related to privacy and security."