Researchers at the University of Illinois at Chicago and Washington University describe an online database called OncoDB, designed for analyzing large cancer datasets to detect gene expression shifts and infections by viruses with potential ties to cancer, such as human papillomavirus. The current version of the resource contains data for more than 10,000 tumors and matched normal samples profiled for the Cancer Genome Atlas (TCGA) project, the team notes, including RNA sequencing data, DNA methylation profiles, and corresponding clinical information, along with related regulatory insights and normal sample data from the GTEx study. "By mining TCGA RNA-seq data, we have identified six major oncoviruses across cancer types and further correlated viral infection to changes in host gene expression and clinical outcomes," the authors write, noting that "results are interactively presented in OncoDB with a flexible web interface to search for data related to RNA expression, DNA methylation, viral infection, and clinical features of the cancer patients."
A team at Huazhong University of Science and Technology, Fudan University, and other centers in China presents a curated database focused on microbes associated with different parts of the human body and their potential consequences for human health. mBodyMap currently spans 22 sites in the human body, the investigators say, and contains 16S ribosomal RNA gene amplicon sequence and/or metagenomic sequence data generated for more than 63,100 samples from prior studies, covering microbes suspected of being associated with dozens of human diseases. "mBodyMap organizes collected samples based on their association with human diseases and body sites to enable cross-dataset integration and comparison," they report, adding that the database is also home to "pre-computed abundances and prevalence of 6,247 species … stratified by body sites and diseases."
Finally, researchers in France report on a database and web server dubbed Genomicus, a comparative genomics site initially released more than a decade ago. The site has expanded considerably since 2010, and now incorporates data for more than 1,000 extant genomes and 621 reconstructions representing ancestral eukaryotes, they note. It also includes a growing collection of search, data visualization, and analytical tools. Genomicus "is still unique for its ability to compare hundreds of extant genomes and is the first tool to provide access to inferred ancestral gene orders for five different eukaryote kingdoms," the authors write, adding that the database "is now part of the catalogue of bioinformatic tools labelled by ELIXIR."