Six Degrees of Protein Separation


With a background in astrophysics, computational biologist Imran Shah isn’t afraid of big numbers or hairy computer problems. That’s good for a person who wants to figure out all the proteins in the universe.

Shah — a native of Pakistan and the first recipient of a PhD in computational biology at George Mason University — joined the University of Colorado faculty last year and launched a project to explore new techniques for discovering proteins and their functions using microbes such as E. coli and H. pylori as model organisms.

Shah draws on available databases of enzymes and biochemical reactions to construct a network that maps out metabolic pathways and possible connections between them. “If we’ve identified half the proteins and want to figure out what the other half is doing, this can help us fill in the gaps,” he says. Shah relies on a cluster of eight Linux-based computers (and government supercomputers when needed) to run machine-learning algorithms, written in LISP, that pore over thousands of chemical reactions and generate rules about how one substance can transform into another. The computer can generate hypotheses that might provide helpful clues for a wet-lab biologist about potential pathways or links between compounds, Shah says.
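The network idea can be illustrated with a minimal sketch. This is not Shah's LISP code and does no machine learning; it only assumes that reactions are stored as (substrate, enzyme, product) records and searches them breadth-first for a chain linking two compounds. The three glycolysis steps and the phosphoglucomutase branch are textbook reactions chosen for illustration, each treated as one-directional for simplicity, and the names `REACTIONS`, `build_network`, and `find_pathway` are hypothetical.

```python
from collections import deque

# Toy reaction records: (substrate, enzyme, product).
# Illustrative textbook reactions only, not drawn from the databases in the article.
REACTIONS = [
    ("glucose", "hexokinase", "glucose-6-phosphate"),
    ("glucose-6-phosphate", "phosphoglucose isomerase", "fructose-6-phosphate"),
    ("fructose-6-phosphate", "phosphofructokinase", "fructose-1,6-bisphosphate"),
    ("glucose-6-phosphate", "phosphoglucomutase", "glucose-1-phosphate"),
]


def build_network(reactions):
    """Map each substrate to the (enzyme, product) pairs reachable from it."""
    network = {}
    for substrate, enzyme, product in reactions:
        network.setdefault(substrate, []).append((enzyme, product))
    return network


def find_pathway(network, start, goal):
    """Breadth-first search for the shortest chain of reactions from start to goal.

    Returns a list of (enzyme, product) steps, or None if no chain exists.
    """
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        compound, steps = queue.popleft()
        if compound == goal:
            return steps
        for enzyme, product in network.get(compound, []):
            if product not in seen:
                seen.add(product)
                queue.append((product, steps + [(enzyme, product)]))
    return None


if __name__ == "__main__":
    network = build_network(REACTIONS)
    for enzyme, product in find_pathway(network, "glucose", "fructose-1,6-bisphosphate"):
        print(f"--[{enzyme}]--> {product}")
```

In a sketch like this, a step whose enzyme is unknown would show up as a missing edge; proposing candidates for such gaps is the kind of hypothesis the article says the system hands to wet-lab biologists.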

Some scientists are intrigued by this strategy, while others think Shah might be tackling more than he and his computers can handle. Shah has demonstrated the concept by retracing about 10 known metabolic pathways. By the end of the year, he hopes to find hints of new functions for known proteins. If he’s lucky, the discovery of new proteins will soon follow.

— Steve Nadis
