In Science this week, a team led by Whitehead Institute researchers reports that some volunteers who donate genome sequence information can be identified using publicly available information, raising privacy concerns. By profiling short tandem repeats on the Y chromosome and mining recreational genetic genealogy databases, the investigators show that surnames can be recovered from personal genomes. The surnames, combined with metadata such as age and state of residence, can then be used to triangulate the identity of a participant. Using this approach, they were able to determine the identities of about 50 volunteers in the 1000 Genomes Project. The scientists suggest that the public shouldn't stop donating to such research efforts, but call for the establishment of "clear policies for data sharing, educating participants about the benefits and risks of genetic studies, and the legislation of proper usage of genetic information."
Indeed, Baylor College of Medicine ethicist Amy McGuire tells The New York Times that "the illusion you can fully protect privacy or make data anonymous is no longer a sustainable position."
And our sister publication BioInform has more on this study.
Also in Science, investigators from the Research Institute of Molecular Pathology in Austria present a new method, called STARR-seq, to directly and quantitatively assess genomic enhancer activity for millions of candidates from arbitrary sources of DNA, which enables screens across entire genomes. In the Drosophila genome, STARR-seq identified thousands of cell-type-specific enhancers across a broad continuum of strengths, linked differential gene expression to differences in enhancer activity, and created a genome-wide quantitative enhancer map. Notably, the method can be used in other eukaryotes, including humans.