A University of California, San Francisco, team has identified a range of diseases and drugs associated with the expression of ACE2, the receptor SARS-CoV-2 relies on to enter cells. In their new Genome Biology paper, the UCSF team created a semi-automated framework called GENe Expression Variance Analysis, or GENEVA, which they used to sift through 286,650 publicly available RNA-seq samples to identify drugs, diseases, and other factors that affect ACE2 expression. Through this analysis, they found that cardiomyopathy, treatment with RAD140 and itraconazole, and HNF1A overexpression all affect ACE2 expression. By searching through electronic health records of COVID-19 patients, the team further found that patients with cardiomyopathy had a higher risk of mortality than patients with other cardiovascular conditions. The researchers note that while further study is needed, "our result identifies … cardiomyopathy patients as a high-risk group that needs extra protection and care."
Researchers from the Chinese Academy of Forestry have generated a chromosome-scale reference genome for the oil-producing Camellia shrubs, alongside transcriptome sequencing data from 221 cultivars. As they report in Genome Biology, the researchers generated a 2.95-gigabase genome that included a number of repetitive elements and harbored a high level of heterozygosity. Through a genome-wide association study, the researchers additionally identified genes likely to be involved in oil production, which they said could help in understanding the genetic and genomic underpinnings of domestication and breeding programs. Further, they write that "[t]he linkage map and the precise variations of candidate genes can contribute to applications of the molecular marker-assisted breeding and genomic selection."
Researchers from Leiden University in the Netherlands have developed a new measure to tease out clusters from within single-cell RNA-sequencing data. Their approach, dubbed phiclust (ϕclust), is derived from random matrix theory and relies on the ϕ angle between vectors that represent the noise-free signal and the measured, noisy signal. As they report in Genome Biology, the researchers applied ϕclust to single-cell RNA-seq data from bone marrow mononuclear cells and to a fetal human kidney dataset, dividing the cells into clusters, including ones representing known cell subtypes but also ones that could not previously be identified from that data. "We hope that quantitative measures of clusterability, such as phiclust, can play an important role in making single-cell RNA-seq analysis more reproducible and robust," the researchers write.