The differential expression profiling field suffers from widespread bias, and the statistical methods used to analyze high-throughput sequencing data are unreliable, according to a report appearing in PLOS Biology this week. In recent years, concerns have been raised over the quality of experimental science, particularly in terms of reproducibility, replicability, and statistical power to find true effects. Aiming to explore these issues as they relate to expression profiling by high-throughput sequencing, scientists from the University of Tartu analyzed tens of thousands of relevant datasets submitted to the National Center for Biotechnology Information's Gene Expression Omnibus between 2008 and 2020. "Our goal was to study real-world statistical inferences made by working scientists," the study's authors write. "Thus, we study how experimental design choices and analytic decisions of scientists affect the quality of their statistical inferences." Collectively, the results suggest that the field is largely built on analyzing low-power experiments that are unlikely to identify actual effects while presenting an unknowable number of false discoveries as statistically significant. Although the researchers see no single apparent fix for the problem, they say that developing cheaper ways of conducting experiments could make well-powered experiments practically feasible, which in turn could lead to analytic workflows that work.
Field-Wide Bias in Differential Expression by High-Throughput Sequencing Found
Mar 03, 2023