For those who might fancy a protein interaction map to go with their genomic sequence of choice, genome sequencing service center Agencourt Bioscience says it has just the thing. The Beverly, Mass.-based company has jumped into the proteomics arena for the first time with the publication of a bacteria-2-hybrid-derived protein interaction map for the pathogen Rickettsia sibirica, in the hope of capitalizing on what it sees as a strong unmet need among researchers. “We don’t see a lot of people out there other than, say, CuraGen, who are going to build the large-scale interaction mapping that I think a lot of people want,” said Joel Malek, senior manager of genomics and proteomics at Agencourt and first author on the Feb. 10 Nucleic Acids Research paper.
In November, CuraGen published a whole-proteome protein interaction map for Drosophila in Science (see PM 11-7-03). “Just the small amount we did, we’ve gotten a huge amount of feedback from researchers who are interested in us continuing,” Malek said. Agencourt has been doing research in protein interactions for a year and a half, according to Malek, but only began offering its work as a service a few months ago. He said that the company was developing other proteomics-related offerings as well, but did not say what they would be.
John Chant, head of genomics and proteomics at CuraGen, said that he was encouraged by the interest Agencourt was generating. “They’re showing that at a very early stage, people are appreciating the importance of protein-protein interactions,” he said.
Agencourt’s approach is a variation on the traditional yeast-2-hybrid method for discovering protein interactions, in which Malek’s team uses E. coli instead of yeast.
The bacteria-2-hybrid method was an obvious choice for a genome sequencing company, Malek said, because “it flows right into a high-throughput DNA sequencing pipeline.” Rickettsia sibirica, which causes a type of spotted fever, was chosen as the first organism for mapping because its genome is small and its translation table is similar to that of E. coli, and because studying a human pathogen fit in with the relationship that Agencourt already has with the US Centers for Disease Control and Prevention for pathogen sequencing.
Agencourt can produce 50,000 to 60,000 sequences a day, Malek said, so the higher throughput of the bacteria-2-hybrid method was appealing. “The yeast-2-hybrid suffers from being labor intensive and having a low screening power, and once you come to detection of the interactions, it takes quite a bit more effort. … The bacterial system allows you to do all of that a whole lot faster and deeper,” Malek said. He added that proteins from eukaryotic organisms such as humans were less likely to cause a lethal response in bacteria than in yeast, which might interpret those proteins as having an internal function.
But as Malek acknowledged, bacterial systems have their drawbacks too, the most notable of which is their inability to add post-translational modifications to the proteins. “[M]ost of us biologists imagine that you’d have a better chance of seeing an interaction by expressing the proteins in a eukaryote,” said Russ Finley, an associate professor at the Center for Molecular Medicine and Genetics at Wayne State University. On the other hand, he added, “there are certainly plenty of interactions that don’t require any modifications to the proteins that you should be able to detect in yeast or bacteria or even in vitro.” Finley said that he had not previously seen anyone put together a high-throughput bacteria-2-hybrid system, and so it was not yet clear how it would match up with high-throughput yeast systems.
Whether the system is bacteria-based or yeast-based, a major problem in all protein interaction mapping systems is the occurrence of false positives and other uncertainties, making it “really hard to prove that two proteins don’t interact in real life,” Finley said. “This is one of the biggest problems in these protein interaction maps now: People don’t have a way of estimating the false positive rates.” What people are really doing when they try to estimate false positives, he said, is estimating the true positive rate and then surmising the false positive rate from that. But this is an imperfect method, and there does not currently seem to be a better one, Finley said.
The Agencourt group is tackling this problem, and the companion problem of false negatives, in a manual fashion. To increase confidence that it doesn’t have false negatives or false positives, Malek’s group takes a “fragment approach,” in which, for every bait, the protein is broken into about five fragments and each one is screened. “It’s been shown quite a few times that if you take that approach, you eliminate false negatives,” Malek said. “As far as false positives go, it’s sort of a compilation of evidence when you see multiple overlapping fragments causing an interaction, you can take more confidence in that interaction being real.” He said that next he hoped to “do a few more microorganisms,” and that once multiple maps were available for a variety of organisms, it would be possible to “start to compare data and really see what’s real.”
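The "compilation of evidence" idea can be illustrated with a short sketch. This is not Agencourt's procedure, which the company describes as manual; it is a minimal Python illustration that assumes each screening hit is recorded as a (bait protein, fragment number, prey protein) tuple and that an interaction counts as better supported when at least two distinct bait fragments detect the same prey. The record layout, the function names, the protein labels and the threshold of two are all hypothetical.

```python
from collections import defaultdict


def fragment_support(hits):
    """Count the distinct bait fragments that detected each (bait, prey) pair.

    `hits` is an iterable of (bait, fragment_number, prey) tuples.
    This record layout is an assumption made for illustration.
    """
    fragments_by_pair = defaultdict(set)
    for bait, fragment, prey in hits:
        # A set ignores repeat hits from the same fragment.
        fragments_by_pair[(bait, prey)].add(fragment)
    return {pair: len(frags) for pair, frags in fragments_by_pair.items()}


def supported_pairs(hits, min_fragments=2):
    """Return pairs seen with at least `min_fragments` distinct bait fragments.

    The default threshold of 2 is an arbitrary, hypothetical cutoff.
    """
    counts = fragment_support(hits)
    return sorted(pair for pair, n in counts.items() if n >= min_fragments)


# Made-up example hits; the protein names are placeholders.
example_hits = [
    ("protA", 1, "protB"),
    ("protA", 2, "protB"),
    ("protA", 2, "protB"),  # repeat hit from the same fragment
    ("protA", 4, "protC"),
]

print(fragment_support(example_hits))
print(supported_pairs(example_hits))
```

In this made-up data, the protA-protB pair is detected by two different fragments and passes the cutoff, while protA-protC is detected by only one and does not. The sketch counts distinct fragments only; it does not check whether the fragments actually overlap in sequence, which a fuller treatment would need to do.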
Unlike CuraGen, which used an algorithm to sort through the interactions and assign confidence ratings based on theoretical protein interaction feasibility, Agencourt is not using a computational approach at this time. Chant said that assigning confidence ratings manually was “self-limiting.” Although it could be done, he said, a computational approach would ultimately be more efficient and useful. “Our algorithms incorporate [their methods] in a more sophisticated manner, but I think [for] the whole field of interaction mapping, having computational methods for understanding the confidence of these very large data sets [is] essential,” he said.
In the meantime, Agencourt is focusing on expanding its service to incorporate interaction maps for more organisms, and, later on, more proteomics services. “This is the needle on a big push,” Malek said.