At Beyond Genome Conference, a Call for Bioinformatics to Move Beyond Pretty Pictures


Atul Butte is annoyed with the “dummy mentality” in bioinformatics — especially in the microarray analysis area. “You can tell what algorithm [someone used] by the colors of the figures in their paper,” he quipped during the first day of the bioinformatics and genome research track of the Beyond Genome conference in San Diego last week.

Butte, an endocrinologist at Children’s Hospital, Boston, devoted his talk to bringing bioinformatics out of this intellectual rut, and his sentiments seem to have been heard loud and clear by some of the other speakers: The recurring theme was the development of more sophisticated algorithms that can extract biological meaning from data — in the case of pharma, biological meaning with drug discovery relevance — rather than just making pretty pictures.

Butte focused on the construction of relevance networks, networks that show the weakness or strength of a relationship between genes. He noted that the reams of new information being constantly added to GenBank and other databases make these networks extremely dynamic. In other words, he said, “You’re never done with microarray analysis.” But he noted that microarray data has its limits. “Not all pathways will be reverse-engineered by microarray analysis.”
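As a rough illustration of the idea, a relevance network can be sketched by scoring every pair of genes and keeping only the strong relationships as edges. The sketch below is not Butte's implementation: it assumes Pearson correlation as the similarity score and an arbitrary cutoff of 0.9, and the gene names and expression values are invented for the example.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (assumes non-zero variance)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def relevance_network(profiles, threshold=0.9):
    """Return {(gene1, gene2): r} for pairs whose |r| meets the threshold.

    The threshold is an illustrative assumption, not a value from the talk.
    """
    edges = {}
    for g1, g2 in combinations(sorted(profiles), 2):
        r = pearson(profiles[g1], profiles[g2])
        if abs(r) >= threshold:
            edges[(g1, g2)] = r
    return edges


# Hypothetical expression profiles across four samples.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.1, 3.9, 6.2, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [1.0, 3.0, 1.0, 3.0],
}
network = relevance_network(profiles)
for pair, r in network.items():
    print(pair, round(r, 3))
```

In this toy data, geneD is only weakly related to the others and drops out of the network, while the strongly negative geneA and geneC pair is kept because the cutoff applies to the absolute value of the score. Rerunning the function as new samples or annotations arrive is one way to picture why such networks are never finished.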

Philip Xiang, director of bioinformatics at Roche Molecular Systems, discussed his own approach to making gene expression analysis more sophisticated. He described how he combines information on gene expression and gene ontology trees. Xiang said he has also written a program that calculates the average pairwise separation between any two genes.

Similarly, Jeffrey Sachs of Merck Research Labs said in his talk that he is “elucidating biological pathways by integrating gene annotations and gene expression data.” Sachs has developed an approach wherein he uses 10 key variables in the gene annotation as nodes of comparison for 5,000 genes under study. He looks at how the expression predicts annotation, and how expression and annotation predict other variables, including the effect of a compound on the genes. “We were thrilled that we were able to completely blindly make a prediction about a compound that people familiar with the datasets didn’t know about,” he said.

Yixin Wang at Johnson & Johnson’s molecular diagnostics unit described the way his group grappled with the issue of how to treat numeric vs. qualitative descriptors in comparing microarray data. They developed a new algorithm, borrowing the fuzzy logic method from field engineering, to be able to navigate between quantitative and qualitative data to identify significant expression patterns in the data and explore transcriptional regulatory networks, he said. This particular application of fuzzy logic involved converting the gene expression value on the gene chip to a category — low, medium, or high — and grouping genes with similar values into related “triplets.” By comparing the predicted relationship between members of a triplet to known relationships, they could test out the robustness of the algorithm, he said. Already, the group has used this approach to extract an eight-gene signature that predicts survival time in colorectal cancer with around 80 percent sensitivity and specificity — an improvement over existing methods, he said.
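The first step Wang described, turning a numeric expression value into a low, medium, or high category, can be sketched with simple fuzzy membership functions. This is only a minimal illustration under stated assumptions: it assumes expression values have already been scaled to the range 0 to 1, uses triangular membership functions with hypothetical breakpoints, and does not attempt the triplet grouping or the group's actual algorithm.

```python
def memberships(value):
    """Fuzzy membership of a scaled expression value in low/medium/high.

    Assumes the value is scaled to [0, 1]; the triangular shapes and the
    0.5 midpoint are illustrative choices, not taken from the talk.
    """
    v = min(max(value, 0.0), 1.0)        # clamp to [0, 1]
    low = max(0.0, 1.0 - 2.0 * v)        # 1 at v=0, falls to 0 at v=0.5
    medium = 1.0 - abs(2.0 * v - 1.0)    # peaks at v=0.5
    high = max(0.0, 2.0 * v - 1.0)       # 0 until v=0.5, rises to 1 at v=1
    return {"low": low, "medium": medium, "high": high}


def categorize(value):
    """Pick the category with the largest membership (ties go to the first listed)."""
    m = memberships(value)
    return max(m, key=m.get)


# Hypothetical scaled expression values for three genes.
for gene, value in {"gene1": 0.1, "gene2": 0.5, "gene3": 0.9}.items():
    print(gene, categorize(value), memberships(value))
```

Unlike a hard cutoff, the membership values record how strongly a measurement belongs to each category, so a value near a boundary is not forced entirely into one bin before genes are compared.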

For Brian Moldover, from Aventis, the gold mining came not in gene expression analysis, but in searching the genome for splice variants in pharmaceutically relevant protein families. “The interesting genes are heavily patented,” he explained. “However, new variants with specific functions are still patentable.” Furthermore, he noted, 15 percent of heritable diseases involve mutations that affect splicing. Aventis is finding new splice variants in two ways: computationally, and through deep sequencing and deep cloning. The company uses Celera’s genome browser but NCBI’s sequence data, and has found two targets from this work, including two splice variants of hRasGRP4 that are involved in asthma, and another that is “a target for a multibillion dollar therapeutic agent on the market now, and is very well studied,” he said.

