The genome of the nautilus, the only surviving externally shelled cephalopod from the Paleozoic era, is reported in this week's Nature Ecology & Evolution, providing insights into aspects of the animal's evolution such as the pinhole eye and biomineralization. In the study, a team led by scientists from the Chinese Academy of Sciences sequenced the genome of Nautilus pompilius, uncovering a compact, minimalist genome with few protein-coding genes and slow evolutionary rates in both noncoding and coding regions. Coevolutionary gene losses and gene family contractions appear to have driven the evolution of the nautilus' pinhole eye, suggesting that it degenerated from a more complex organ. Meanwhile, unique and new protein-coding genes were found to contribute to the production of aragonite crystals, a major component of the nautilus shell. "The nautilus genome constitutes a valuable resource for reconstructing the evolutionary scenarios and genomic innovations that shape the extant cephalopods," the authors write.
A new computational tool for simultaneous deep generative modeling and clustering of single-cell genomic data is described in Nature Machine Intelligence this week. While technologies such as single-cell ATAC-seq (scATAC-seq) have enabled the large-scale profiling of the chromatin accessibility landscape at the single-cell level, computational analyses are hampered by issues including high sparsity and high dimensionality. To overcome such limitations, scientists from Tsinghua University and Stanford University developed scDEC, a method for analyzing scATAC-seq data by simultaneously learning the deep embedding and clustering of the cells in an unsupervised manner. They demonstrate scDEC's superiority to other approaches in a series of experiments and discuss several downstream applications of scDEC in scATAC-seq analysis, including trajectory inference, donor effect removal, and latent feature interpretation. The investigators also show its applicability to multi-modal single-cell analysis using a real data example.
To help manage data on the rapidly growing number of SARS-CoV-2 variant genomes, a University of California, Santa Cruz, team has developed a new computational resource for rapidly incorporating viral genome isolates into a global phylogenetic tree. The unprecedented and ongoing accumulation of SARS-CoV-2 genome sequences is ushering in a new era of genomic contact tracing, but the massive amount of genome sequence data is also pushing phylogenetic analysis frameworks to their limits, the researchers write. To address this, they developed a software package called UShER (short for ultrafast sample placement on existing trees) that immediately incorporates SARS-CoV-2 genome isolates into a global phylogenetic tree. The scientists write that, compared to its closest counterpart, UShER is more than 3,000 times faster and orders of magnitude more memory efficient, and that it enables real-time genomic contact tracing. It is freely available to the research community through the UCSC SARS-CoV-2 Genome Browser, enabling rapid cross-referencing of information in new virus sequences with an ever-expanding array of molecular and structural biology data, they note.