A Shanghai Jiao Tong University-led team describes a hierarchical, analytical pipeline for identifying non-reference sequences in pan-genome collections generated for humans or other eukaryotic species with similar genome sizes. The researchers applied their "Human Pan-genome Analysis" (HUPAN) approach to 90 assembled Han Chinese genomes and 185 newly sequenced genomes from the same population, uncovering more than 29 million bases of genome sequence not found in the human reference genome. Taking a closer look at those previously unappreciated sequences, the authors tracked down 188 predicted protein-coding genes not found through typical comparisons to the human reference genome. "HUPAN is a useful tool for capturing complexity of the human genome," they write, "and the constructed pan-genome can be an important resource for a wide range of human genome-related biomedical studies, such as cancer genome analysis."
Researchers in Germany present a parallel targeted sequencing strategy for quantitatively profiling genome sequences, transcripts, and/or single-cell datasets on a budget. The approach, known as "Barcode Assembly for Targeted Sequencing" (BART-Seq), involves pooled, high-throughput sequencing of amplicons produced with DNA barcode panels and consistent, multiplexed sets of forward and reverse PCR primers, the authors explain. "This concept of a priori sample indexing is different from the existing transcript-targeted analysis techniques, which are generally based on pre-amplification first, and indexing of the samples using DNA barcodes afterwards." The team applied BART-Seq to genotype BRCA1/2 in cancer patients, for example, and used the same primer-barcode amplification-based approach to profile expression markers of development in thousands of human pluripotent stem cells.
A team from the Novartis Institutes for Biomedical Research in Switzerland outlines an analytical method called "Cell Subtype Identification from Upregulated gene sets" (CellSIUS) for homing in on rare cell types in complex collections of single-cell RNA sequencing data. The researchers used the algorithm to assess synthetic and real datasets, including single-cell RNA-seq profiles for cells in human stem cell-derived populations. That analysis uncovered new and known neural cell types from rare cell lineages thought to contribute to deep-layer corticogenesis in the brain. "[T]he signature gene lists output from CellSIUS provide the means to isolate [rare cell populations] for in vitro propagation and characterization of their role in neurological disorders," the authors suggest.