NEW YORK (GenomeWeb) – Researchers led by Marc Vidal at the Dana-Farber Cancer Institute have systematically mapped nearly 14,000 human protein-protein interactions.
As the researchers reported in Cell today, this interactome map is about 30 percent larger than previous maps and covers a broader landscape of interactions, indicating that the human interactome network may be vaster than previously anticipated.
"We're finally reaching a serious level of coverage and quality with maps like this," Vidal told GenomeWeb. "This is probably the first example for direct protein-protein interactions, binary protein-protein interactions."
Advances in genomics and sequencing have enabled scientists to uncover scores of genotypic variations, but as Vidal noted, the lines connecting the dots of many of those variants to phenotype and function remain to be drawn.
"We know a lot about phenotypes, we know … about genotypes, but there's this thing in between," he said. "Genes make proteins and non-coding RNAs that interact in all kinds of complex ways and form interactome networks and complex systems, the properties of which we are still discovering little by little.
"The way we view human disease is to say any time you have a variant at the DNA level, it's fine, but what really matters is the perturbations that this variant actually causes to the cellular systems downstream," he added.
To begin to map the human interactome, Vidal and his colleagues examined the pairwise interactions of proteins encoded by some 13,000 genes. They performed yeast two-hybrid assays to detect interacting pairs, and only pairs that tested positive in three of the four attempts and whose identities were confirmed were deemed to be interacting. They further validated these interactions using MAPPIT, wNAPPA, and protein-fragment complementation assays conducted in Chinese hamster ovary cells.
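The replicate filter described above can be sketched in a few lines of Python. This is only an illustration, not the team's pipeline: the function name, the data layout, and the reading of "three of the four attempts" as "at least three" are all assumptions, and the identity-confirmation step is not modeled.

```python
from collections import Counter

def call_interactions(replicate_hits, min_positive=3):
    """Return the pairs that scored positive in at least `min_positive` screens.

    `replicate_hits` is a hypothetical input: one collection of
    (protein_a, protein_b) tuples per screening attempt.
    """
    counts = Counter()
    for hits in replicate_hits:
        # A frozenset makes (A, B) and (B, A) count as the same pair,
        # and the set comprehension counts each pair once per attempt.
        counts.update({frozenset(pair) for pair in hits})
    return {pair for pair, n in counts.items() if n >= min_positive}

# Toy data: four attempts, invented protein names.
replicates = [
    {("A", "B"), ("C", "D")},
    {("B", "A")},
    {("A", "B"), ("C", "D")},
    {("E", "F")},
]
called = call_interactions(replicates)
```

In this toy run only the A-B pair clears the threshold, since C-D is seen twice and E-F once.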
The resulting binary interaction map — dubbed HI-II-14 — includes 13,944 interactions among 4,303 proteins, the researchers reported.
To examine the biological significance of their new interactome map, Vidal and his colleagues compared it to a set of 11,045 high-quality protein pairs obtained from the literature. Both datasets were enriched for similar Gene Ontology terms, had an enrichment of binary interactions between proteins belonging to a common complex or expressed in the same cell type, and had more protein-protein interactions involving kinases and their substrates than a random network. This, they said, indicates that their new map reveals functional relationships at a level comparable to the literature-based interaction maps.
But previous examinations of protein-protein interactions were often limited to small regions of the overall protein-protein interaction space, Vidal said, as researchers tended to focus on their own proteins of interest.
A more systematic approach, such as the one he and his colleagues employed here, is better able to give a wider view of the overall interactome space.
They reported that the protein pairs included in the HI-II-14 dataset are distributed more homogeneously across the interactome space than literature-curated interactions, even when accounting for differences in gene expression levels.
About half of the proteome is known to participate in the interactome network, and they suggested that as their systematic exploration has led to an expansion of the interactome landscape, the entire proteome eventually could be found within the interactome network.
"Where we now have a good handle on what's going on is that we can definitely validate and demonstrate the biophysical relevance of these interactions," Vidal said.
But this is a map, he noted, from which biology can be dissected out.
In their paper, Vidal and his colleagues drew upon the interactions their map contained to examine the relationships between cancer-linked proteins.
He likened the lists of cancer-related genes to the phone book and the interactome map they generated to Facebook. "What we decided to do was to ask a very simple question: To what extent does the interactome map as we have it right now, how can it play the role of a Facebook network beyond the phone book lists?" he said.
They found that products of known cancer genes, from the Sanger Cancer Gene Census list, are more likely to interact with one another than would be expected by chance.
Additionally, products of genes identified through genome-wide association studies to be candidate cancer genes — another list — tended to be more commonly connected to Cancer Census proteins.
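One generic way to ask whether a gene list is "more connected than expected by chance" is a permutation test, sketched below. This is an assumption about how such a comparison could be run, not the method used in the Cell paper; the function names, the toy network, and the choice of uniformly random protein sets as the null model are all hypothetical.

```python
import random

def count_internal_edges(edges, members):
    """Count interactions whose two partners both belong to `members`."""
    return sum(1 for a, b in edges if a in members and b in members)

def enrichment_p_value(edges, proteins, gene_set, n_perm=1000, seed=0):
    """Compare edges within `gene_set` to same-sized random protein sets.

    Returns the observed count and an empirical one-sided p-value.
    """
    rng = random.Random(seed)
    observed = count_internal_edges(edges, gene_set)
    pool = sorted(proteins)  # sorted so the sampling is reproducible
    hits = 0
    for _ in range(n_perm):
        sample = set(rng.sample(pool, len(gene_set)))
        if count_internal_edges(edges, sample) >= observed:
            hits += 1
    # Add-one correction so the p-value is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)

# Toy network: 20 invented proteins, 4 of which all interact with each other.
proteins = {f"P{i}" for i in range(20)}
cancer_set = {"P0", "P1", "P2", "P3"}
edges = [("P0", "P1"), ("P0", "P2"), ("P0", "P3"),
         ("P1", "P2"), ("P1", "P3"), ("P2", "P3")]
observed, p_value = enrichment_p_value(edges, proteins, cancer_set)
```

A real analysis would likely need a stricter null model, for example one that preserves how many partners each protein has, since well-studied proteins tend to have more recorded interactions.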
For example, Vidal and his colleagues focused on the C-terminal Binding Protein 2 (CTBP2) gene, which is encoded near a locus linked to prostate cancer susceptibility. IKZF1 and FLI1, two Cancer Census genes, encode proteins that interact with CTBP2, according to the new interactome map. As those two genes have been implicated in lymphoid tumors, the researchers investigated whether CTBP2 may play a role in lymphoid tumorigenesis.
According to the Cancer Cell Line Encyclopedia, FLI1 is commonly amplified in lymphoid tumors, while CTBP2 and IKZF1, but not CTBP1, are commonly deleted in those tumors. This, the researchers noted, suggests a role for CTBP2 in suppressing lymphoid tumorigenesis by directly repressing FLI1 function, possibly in tandem with IKZF1.
They added that many novel cancer candidates could be teased out based on their interactions with known cancer gene products in cellular pathways.
"It's basically using Facebook and connecting-the-dots-type thing to say, 'Can we go from a list of really well known cancer genes to a list of putative genes and make hypotheses this way?'" Vidal said. "And overall it seems like it is working."
Vidal noted, though, that the 13,000 or so genes they drew upon to develop this interactome map fall short of the full expanse of interacting proteins. He and his colleagues aim to cover the interactions of 20,000 genes by the year 2020, and they now have the in-house tools and automation to take on this matrix of 200 million combinations, he said.
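The 200 million figure follows from counting unordered pairs among 20,000 genes. A short Python check is below; the function name is illustrative, and whether self-pairs (a protein tested against itself) are counted is an assumption, though the total is roughly 200 million either way.

```python
from math import comb

def pairwise_search_space(n_genes, include_self=True):
    """Number of unordered gene pairs to test among `n_genes` genes."""
    pairs = comb(n_genes, 2)  # distinct pairs: n * (n - 1) / 2
    # Optionally add the n self-pairs (a protein tested against itself).
    return pairs + n_genes if include_self else pairs

target_space = pairwise_search_space(20_000)                       # 200,010,000
target_space_no_self = pairwise_search_space(20_000, include_self=False)  # 199,990,000
current_space = pairwise_search_space(13_000)                      # 84,506,500
```

By the same count, the roughly 13,000 genes screened so far span about 85 million pairs, a bit over 40 percent of the planned search space.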