Genome Biology Papers on Regulatory Element, Single-Cell Integration, Sequence Simulation Tools

Researchers at the Gladstone Institutes and elsewhere introduce an algorithm called CellWalker, which assesses regulatory element features by combining single-cell ATAC-seq (scATAC-seq) data for profiling open chromatin regions with single-cell RNA sequencing (scRNA-seq) data and bulk sequence data. "We present CellWalker, a generalizable network model that improves the resolution of cell populations in scATAC-seq data, determines cell label similarity, and generates cell type-specific labels for bulk data by integrating information from scRNA-seq and a variety of bulk data," the team writes. When the authors applied CellWalker to scRNA-seq and scATAC-seq data from developing human brain samples, they tracked down potential cell type-specific regulatory elements, including elements and cell types related to genes implicated in autism spectrum disorder, developmental delay, or other neurological traits and conditions.

A Peking University-led team outlines the "integration of multiple single-cell datasets by adversarial paired-style transfer networks" (iMAP) algorithm, a deep learning framework that integrates multiple single-cell RNA-seq datasets while dialing down the so-called batch effects that interfere with the ability to interpret authentic biological variation. Using available single-cell RNA-seq data generated for more than 50,000 individual cells with Smart-seq2 and 10x Genomics approaches, for example, the researchers profiled tumor-infiltrating immune cells in samples from 18 individuals with colorectal cancer, identifying previously unappreciated immune cell interactions. The iMAP method "may be easily extended to tackle other types of single-cell measurements," they note. "We expect this work to be further improved to suit the multi-dimensional nature of the new single cell data."

Finally, investigators at the University of Zurich and the SIB Swiss Institute of Bioinformatics share a strategy for doing more realistic simulations of high-throughput, short-read Illumina sequence data. The team reasoned that simulations "are a critical part of method comparisons, but for standard Illumina sequencing of genomic DNA, they are often over-simplified, which leads to optimistic results for most tools." In an effort to come up with more authentic sequence simulations, the authors developed a tool known as ReSeq that takes systematic sequence errors into account through training on large datasets, which they applied alongside 11 available datasets. "We show that ReSeq outperforms all competitors in terms of delivering a realistic simulation," they write, "and therefore lays the methodological groundwork for accurate benchmarking of genomics tools."
