In a database issue paper published online in advance this week, members of the FlyBase Consortium discuss the basics of navigating their Web resource. "We review the FlyBase Web site with novice and less-experienced users … in mind and point out recent developments stemming from the availability of genome-wide data from the modENCODE project," the authors write.
Researchers in Germany this week show that "the allele distribution in next-generation sequencing data sets is accurately described as the result of a stochastic branching process." More specifically, the researchers use human exome data to show that "the variance of allele frequencies at heterozygous loci is higher than expected by a simple binomial distribution." Because of this, mutation callers that rely on binomially distributed priors "are less sensitive for heterozygous variants that deviate strongly from the expected mean frequency," the team writes.
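To see why a branching process can inflate variance beyond the binomial expectation, consider the toy simulation below. It is only a sketch and not the authors' model: it assumes each allele at a heterozygous locus starts from a handful of template molecules, that every molecule is copied with a fixed probability per amplification cycle, and that reads are then drawn at random from the amplified pool. All function names and parameter values (`start_copies`, `cycles`, `efficiency`) are hypothetical choices made for illustration.

```python
import random
import statistics


def amplify(n_molecules, cycles, efficiency, rng):
    """Toy branching process: each molecule is copied with probability
    `efficiency` in every cycle (an assumed amplification model)."""
    for _ in range(cycles):
        n_molecules += sum(1 for _ in range(n_molecules) if rng.random() < efficiency)
    return n_molecules


def branching_alt_fraction(depth, start_copies, cycles, efficiency, rng):
    """Alt-allele read fraction when both alleles are amplified independently
    and reads are then sampled from the amplified pool."""
    ref = amplify(start_copies, cycles, efficiency, rng)
    alt = amplify(start_copies, cycles, efficiency, rng)
    pool_fraction = alt / (ref + alt)
    alt_reads = sum(1 for _ in range(depth) if rng.random() < pool_fraction)
    return alt_reads / depth


def binomial_alt_fraction(depth, rng):
    """Alt-allele read fraction under a plain binomial model with p = 0.5."""
    return sum(1 for _ in range(depth) if rng.random() < 0.5) / depth


def compare_variances(trials=2000, depth=50, seed=0):
    """Return (binomial variance, branching-process variance) of the
    simulated alt-allele fractions at a heterozygous locus."""
    rng = random.Random(seed)
    binom = [binomial_alt_fraction(depth, rng) for _ in range(trials)]
    branch = [
        branching_alt_fraction(depth, start_copies=5, cycles=8, efficiency=0.7, rng=rng)
        for _ in range(trials)
    ]
    return statistics.pvariance(binom), statistics.pvariance(branch)


if __name__ == "__main__":
    binom_var, branch_var = compare_variances()
    print(f"binomial variance:  {binom_var:.4f}")
    print(f"branching variance: {branch_var:.4f}")
```

Under these assumed settings the branching-process fractions spread more widely around 0.5 than the binomial ones, which is the kind of overdispersion that would make a binomial prior too strict for heterozygous calls far from the mean.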
Using data extrapolated from a high-resolution alternative splicing RT-PCR panel, the Medical University of Vienna's Maria Kalyna and her colleagues show that "alternative splicing and nonsense-mediated decay modulate expression of important regulatory genes in Arabidopsis." Kalyna et al. show that alternative splicing and nonsense-mediated decay together regulate the abundance of transcripts for several transcription factors, RNA processing factors, and stress response genes.
New York University's Wei-Hsiang Lin and Edo Kussell show that simple sequence repeat type "strongly influences indel mutation rates" in prokaryotic coding regions. Using codon-shuffling algorithms, Lin and Kussell found that "coding sequences suppress repeats in the middle of proteins, and enrich repeats near termini," they write, adding, though, that in some cases they observed "over-enrichment of SSRs [simple sequence repeats] near protein N-termini significantly beyond expectation based on structural constraints."
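One common way to build a null expectation for repeat content in coding sequence is to shuffle synonymous codons, which keeps the encoded protein and the overall codon usage fixed while scrambling the nucleotide-level order. The sketch below illustrates that general idea; the source does not describe the details of Lin and Kussell's algorithms, so this should not be read as their method. The function names are hypothetical, the input is assumed to be an uppercase, in-frame DNA sequence, and the longest homopolymer run is used here only as a crude stand-in for a repeat measure.

```python
import random
from collections import defaultdict
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def split_codons(cds):
    """Split an in-frame coding sequence into codons, dropping any trailing partial codon."""
    return [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]


def translate(cds):
    """Translate a coding sequence into one-letter amino acids ('*' marks a stop)."""
    return "".join(CODON_TABLE[codon] for codon in split_codons(cds))


def shuffle_synonymous_codons(cds, rng):
    """Permute codons among positions that encode the same amino acid,
    preserving both the protein sequence and the codon counts."""
    codons = split_codons(cds)
    positions = defaultdict(list)
    for i, codon in enumerate(codons):
        positions[CODON_TABLE[codon]].append(i)
    shuffled = list(codons)
    for idxs in positions.values():
        pool = [codons[i] for i in idxs]
        rng.shuffle(pool)
        for i, codon in zip(idxs, pool):
            shuffled[i] = codon
    return "".join(shuffled)


def longest_homopolymer(seq):
    """Length of the longest single-nucleotide run (a crude repeat measure)."""
    if not seq:
        return 0
    best = run = 1
    for prev, cur in zip(seq, seq[1:]):
        run = run + 1 if cur == prev else 1
        best = max(best, run)
    return best
```

Comparing a repeat measure in the observed sequence against its distribution over many shuffled versions gives a rough sense of whether repeats are suppressed or enriched relative to what the protein sequence alone would allow.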