As privacy concerns grow over the increasing use of DNA analysis technologies, scientists from Harvard University and international collaborators have drafted a framework to enable population genomics research while preserving privacy. Despite the potential benefits of population-scale genomics, unresolved privacy issues — including ones around health data breaches and the use of genomic information by law enforcement — make data sharing difficult and time-consuming. To address this, the researchers describe in Nature Computational Science a system for "privacy-preserving personal genomics centered on individuals' privacy requirements and their demand for more control and transparency over their data and its use." The open-source approach combines cryptographic technologies such as homomorphic encryption and secure multi-party computation with the auditability of blockchains. The system is being deployed with an unnamed personal genomics company to test its real-world utility in alleviating privacy concerns and encouraging data sharing, its authors write.
By analyzing the single-cell transcriptomes of neuroblastomas (pediatric tumors that form in developing nerve cells, particularly around the adrenal glands), a team led by investigators from the German Cancer Research Center has uncovered new details about the cancer's cellular origins. As described in Nature Genetics this week, the researchers used single-nucleus RNA sequencing to define the cell types in adrenal glands and their lineage trajectories during various stages of embryonic and fetal development, then compared their findings to neuroblastoma cell transcriptomes. Among their findings is a transcriptional resemblance between neuroblastomas and normal fetal adrenal neuroblasts. They also report that a neuroblastoma's differentiation state, as placed along the normal neuroblast differentiation trajectory, is associated with cancer prognosis.
A new computational tool for integrating multiple single-cell RNA sequencing datasets is presented in Nature Biotechnology. Developed by a team from the Albert Einstein College of Medicine, the algorithm, called Reference Principal Component Integration (RPCI), uses the gene eigenvectors from a reference dataset to establish a global frame for integration. The scientists demonstrate RPCI with both simulated and real-life datasets to show that it outperforms other methods "with clear advantages in preserving genuine cross-sample gene expression differences in matching cell types, such as those present in cells at distinct developmental stages or in perturbated versus control studies."
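
The general idea of using a reference dataset's gene eigenvectors as a shared frame can be illustrated with a short sketch. This is not the published RPCI implementation. It is a minimal illustration, assuming plain principal component analysis on a cells-by-genes matrix, and the function names, matrix sizes, and random data below are all hypothetical.

```python
import numpy as np


def reference_gene_eigenvectors(reference, n_components):
    """Compute gene means and top gene eigenvectors from a cells x genes matrix.

    Assumption: ordinary PCA via SVD stands in for whatever decomposition
    the real tool uses.
    """
    gene_means = reference.mean(axis=0)
    centered = reference - gene_means
    # Rows of vt are eigenvectors of the gene-gene covariance matrix.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return gene_means, vt[:n_components]


def project_onto_reference(dataset, gene_means, components):
    """Place any cells x genes dataset into the reference's coordinate frame."""
    return (dataset - gene_means) @ components.T


# Hypothetical toy data: 50 reference cells and 30 query cells over 20 genes.
rng = np.random.default_rng(0)
reference = rng.normal(size=(50, 20))
query = rng.normal(size=(30, 20)) + 0.5  # shifted to mimic a sample difference

gene_means, components = reference_gene_eigenvectors(reference, n_components=5)
ref_embedding = project_onto_reference(reference, gene_means, components)
query_embedding = project_onto_reference(query, gene_means, components)
```

Because every dataset is projected onto the same reference-derived axes, cells from different samples become directly comparable in one coordinate system, and expression differences between samples are carried into that space rather than being removed by a per-dataset fit.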