Web services like the Global Alliance for Genomics and Health's Beacon Project aim to enable data sharing among researchers, but Stanford University's Suyash Shringarpure and Carlos Bustamante report in the American Journal of Human Genetics that such services may also expose participants to re-identification.
Such services tell researchers only whether certain alleles are present in a cohort. Despite that effort to protect participants' privacy, Shringarpure and Bustamante say that with enough queries to a database, they could determine whether a given individual was in that database. "The beacon system is an elegant solution that allows investigators to 'ping' collections of genomes," Bustamante adds in a statement. "This allows people studying the same rare disease to find one another to collaborate."
They calculated that if they had an individual's genome, they could locate that person within a beacon network. For instance, if that network housed genomic data on 1,000 people, they could identify that person or his or her relatives in some 5,000 queries. Similarly, they could determine whether a person was in a beacon of 65 European individuals from the 1000 Genomes Project based on 250 SNPs.
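To make the idea concrete, here is a toy sketch in Python of a presence-only beacon and a naive membership check against it. It is not the authors' statistical method, which this article does not describe, and it does not use the real Beacon API. The names `ToyBeacon`, `has_allele`, and `fraction_present`, the allele labels, and the tiny cohort are all hypothetical and chosen only for illustration.

```python
from typing import Iterable, Set


class ToyBeacon:
    """Hypothetical beacon: reports only whether an allele occurs in the cohort."""

    def __init__(self, genomes: Iterable[Set[str]]) -> None:
        # Pool every participant's alleles; individual genomes are never exposed.
        self._alleles: Set[str] = set()
        for genome in genomes:
            self._alleles |= genome

    def has_allele(self, allele: str) -> bool:
        # The only question a caller may ask: "is this allele present?"
        return allele in self._alleles


def fraction_present(beacon: ToyBeacon, target: Set[str], max_queries: int) -> float:
    """Query up to max_queries of the target's alleles; return the share answered yes."""
    queried = sorted(target)[:max_queries]
    if not queried:
        return 0.0
    hits = sum(beacon.has_allele(allele) for allele in queried)
    return hits / len(queried)


# Made-up cohort of two participants (allele labels are illustrative only).
alice = {"chr1:100:A", "chr2:200:T", "chr3:300:G"}
bob = {"chr1:100:A", "chr4:400:C"}
beacon = ToyBeacon([alice, bob])

# Someone outside the cohort who happens to share one common allele.
outsider = {"chr1:100:A", "chr5:500:T", "chr6:600:G"}

member_score = fraction_present(beacon, alice, max_queries=250)
outsider_score = fraction_present(beacon, outsider, max_queries=250)
print(member_score, outsider_score)
```

In this toy setting, every allele carried by a cohort member draws a "yes," while an outsider's rarer alleles draw "no" answers, so the gap between the two scores widens as more alleles are queried. A real attack would also have to account for allele frequencies and sequencing errors, which this sketch ignores.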
Shringarpure and Bustamante also suggest ways for such networks to strengthen their security: barring anonymous access and requiring that researchers be approved, merging datasets, and limiting access to smaller genomic regions.
"We welcome the paper and look forward to ongoing interactions with the authors and others to ensure beacons provide maximum value while respecting privacy," Peter Goodhand, executive director of the Global Alliance for Genomics and Health, says.