By now, most biologists working with microarrays have made statistical methods like t-tests and p-values a part of their everyday lives, albeit grudgingly. So when biologists began badgering John Storey, a biostatistician at the University of Washington, for a new statistical tool, he knew they must really need it.
“Biologists were asking, ‘How do I assess significance?’” said Storey. “They know that the p-value doesn’t tell them what they want to know.”
Storey soon realized that the p-value, which describes significance in terms of the false positive rate, wasn’t helping biologists determine which genes to study in large, genome-wide data sets, such as those delivered by microarray experiments. What his biologist colleagues were really looking for, Storey said, was the false discovery rate — a measure that judges the significance of the actual set of genes selected to study.
For example, a false positive rate of 5 percent means that around 5 percent of “uninteresting,” or null, features out of an entire study will be mistakenly called significant, but a false discovery rate of 5 percent means that 5 percent of only those features called significant are in error.
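The difference between the two rates is easier to see with numbers. The following is a minimal Python sketch, not taken from the paper or the software: the function name, the feature counts, and the assumed detection power are all hypothetical and chosen only for illustration.

```python
def expected_rates(n_features, n_null, alpha, power):
    """Expected counts when every feature is tested at p-value cutoff `alpha`.

    All inputs are hypothetical: `n_null` is the number of truly
    uninteresting features, `power` the assumed chance of detecting
    a truly interesting one.
    """
    n_alt = n_features - n_null
    false_pos = alpha * n_null      # null features wrongly called significant
    true_pos = power * n_alt        # interesting features correctly called
    called = false_pos + true_pos   # everything on the final gene list
    return {
        "false_positives": false_pos,
        "called_significant": called,
        # errors as a share of all null features
        "false_positive_rate": false_pos / n_null if n_null else 0.0,
        # errors as a share of the features actually selected
        "false_discovery_rate": false_pos / called if called else 0.0,
    }


# Illustrative scenario: 10,000 genes, 9,000 of them null, cutoff 0.05,
# and an assumed 60 percent power to detect the 1,000 real effects.
rates = expected_rates(n_features=10_000, n_null=9_000, alpha=0.05, power=0.6)
print(rates)
```

Under these made-up numbers the false positive rate stays at the chosen 5 percent, yet roughly 43 percent of the genes on the selected list are expected to be errors, which is the gap the false discovery rate is meant to expose.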
Biologists commonly use p-value cutoffs to determine their gene lists in microarray experiments, but Storey cautioned that this provides little — if any — information about the significance of the genes actually selected. He recommends they instead use the “q-value” — “basically just a user-friendly measure based on the false discovery rate, just as the p-value is based on the false positive rate” — for that task, and developed some software to make the process a little less painful for biologists.
In a recent paper in the Proceedings of the National Academy of Sciences [PNAS 2003 100 (16): 9440-9445], Storey and co-author Robert Tibshirani introduced the software, appropriately called Q-Value, and suggested ways in which it could be applied in several published microarray studies to alleviate the difficulties of interpreting p-value thresholds.
Biologists tend to use an iterative process of arbitrary p-value cutoffs (say, 0.05) combined with biological knowledge, Storey said. If a researcher expects to see a particular gene show up in the selected set, the initial p-value threshold is shifted until that gene appears, but the researcher doesn’t know what impact that adjustment has on the significance of the genes selected. The q-value “justifies what [biologists] were going to do anyway,” by adding a statistical measure to interpret the significance of the selected genes, he said.
The software, a set of R functions that is freely available at http://faculty.washington.edu/~jstorey/qvalue/index.html, estimates q-values for a given list of p-values and generates a series of graphs to help the user decide the significance of various cut-off points. For each q-value threshold, it indicates how many significant results to expect; and for each number of significant results, how many false positives to expect.
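For readers who want a feel for the calculation, here is a rough Python sketch of turning a list of p-values into q-value-style estimates. It is not the Q-Value software or its R interface. The function names are hypothetical, and the proportion of null features (`pi0`) is passed in as a plain parameter rather than estimated from the data; with the default of 1.0 the sketch reduces to the conservative step-up adjustment from the 1995 false discovery rate proposal.

```python
def estimate_qvalues(pvalues, pi0=1.0):
    """Return one q-value-style estimate per p-value, in the input order.

    `pi0` is an assumed proportion of null features; 1.0 is the
    conservative choice and is NOT how the real software picks it.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each estimate never exceeds
    # the estimate of any larger p-value.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        candidate = pi0 * pvalues[i] * m / rank
        running_min = min(running_min, candidate)
        qvalues[i] = running_min
    return qvalues


def summarize_cutoff(qvalues, threshold):
    """Count features at or below a q-value threshold and the rough number
    of false positives expected among them."""
    selected = [q for q in qvalues if q <= threshold]
    n_significant = len(selected)
    expected_false = max(selected) * n_significant if selected else 0.0
    return n_significant, expected_false


# Illustrative p-values only, not from any published study.
pvals = [0.5, 0.001, 0.04, 0.01]
qvals = estimate_qvalues(pvals)
print(qvals)
print(summarize_cutoff(qvals, threshold=0.05))
```

The second helper mirrors the kind of summary described above: for a chosen q-value threshold, how many features are called significant and roughly how many of those are expected to be false positives.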
Storey said that prior to the PNAS paper, over 300 users had already downloaded Q-Value based on word of mouth alone. He is currently working on a more user-friendly point-and-click version of the software.
The concept of the false discovery rate was first proposed in 1995, Storey said, but the idea has only recently been extended to work on a genome-wide scale with tens of thousands of features. The work shows that bioinformatics is not limited to borrowing established statistical methods, but can contribute a few of its own. “Genome-wide studies have really inspired us to take a different perspective on some of these ideas about statistical significance,” said Storey. “The field really has motivated some new statistical ideas.”