Participants in human genomics studies are typically guaranteed anonymity, a practice intended to safeguard against use of an individual's data by outside parties like insurers or employers.
Just how anonymous, though, is this anonymous data?
This week in Nature, Erica Check Hayden profiles Whitehead Institute researcher Yaniv Erlich, a computational biologist who is applying lessons learned from his days as a hacker to investigate the security of genomic study data. And this data, it turns out, isn't all that secure.
Check Hayden reports, for instance, that in a paper published in Science this January, Erlich's lab demonstrated that it could identify participants in genetic research studies by cross-referencing their genetic data with publicly available information like age and place of residence.
Using a software program he and an undergraduate student had developed for profiling short tandem repeats, Erlich identified nearly 50 supposedly anonymous participants from the 1000 Genomes Project.
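To make the idea of cross-referencing concrete, here is a toy Python sketch of a linkage attack. It is not Erlich's actual pipeline: every record, field name, the `GENEALOGY` lookup table, and the `reidentify` function are invented for illustration, and the surname lookup step is an assumption of this sketch rather than something described above. It only shows how a nameless record can become identifiable once a marker profile, age, and place of residence are matched against public records.

```python
from collections import defaultdict

# Hypothetical "anonymous" study records: no names, but each carries a
# short tandem repeat (STR) profile plus age and state of residence.
STUDY_RECORDS = [
    {"id": "S1", "str_profile": (13, 24, 14), "age": 52, "state": "UT"},
    {"id": "S2", "str_profile": (12, 22, 15), "age": 37, "state": "CA"},
    {"id": "S3", "str_profile": (14, 23, 13), "age": 45, "state": "NY"},
]

# Hypothetical public genealogy table mapping STR profiles to surnames
# (an assumption made for this sketch).
GENEALOGY = {
    (13, 24, 14): "Doe",
    (12, 22, 15): "Roe",
}

# Hypothetical public directory with names, ages, and states.
PUBLIC_DIRECTORY = [
    {"name": "John Doe", "surname": "Doe", "age": 52, "state": "UT"},
    {"name": "Jim Doe", "surname": "Doe", "age": 30, "state": "UT"},
    {"name": "Ann Roe", "surname": "Roe", "age": 37, "state": "CA"},
    {"name": "Amy Roe", "surname": "Roe", "age": 37, "state": "CA"},
]


def reidentify(study, genealogy, directory):
    """Return {study_id: name} for records with exactly one public match."""
    # Index the public directory by (surname, age, state).
    index = defaultdict(list)
    for person in directory:
        key = (person["surname"], person["age"], person["state"])
        index[key].append(person["name"])

    matches = {}
    for record in study:
        surname = genealogy.get(record["str_profile"])
        if surname is None:
            continue  # no genealogy hit, so nothing to link against
        candidates = index.get((surname, record["age"], record["state"]), [])
        if len(candidates) == 1:
            # A unique candidate means the record is effectively re-identified.
            matches[record["id"]] = candidates[0]
    return matches


if __name__ == "__main__":
    print(reidentify(STUDY_RECORDS, GENEALOGY, PUBLIC_DIRECTORY))
```

In this toy data only one record is linked to a single person: the second has two equally plausible candidates, and the third has no entry in the lookup table. The point is that each extra public attribute narrows the candidate pool, which is why seemingly harmless details like age and state matter.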
As Check Hayden observes, this wasn't the first time someone had demonstrated that it was possible to identify study participants based on their data. Those past efforts, though, had relied on other sources of research data.
"Erlich's study," she writes, "upped the stakes, because it showed that it was possible to identify people from their genetic data by linking not to other sources of research data, but to information freely available on the Internet."
How the community can and will address this issue isn't yet clear. Regardless, Check Hayden says, Erlich has helped move the problem into the spotlight.