By Aaron J. Sender
Taking a sabbatical at Brandeis University last year may have been the most important move of Mary Jo Ondrechen’s academic career.
After 21 years as a Northeastern University chemist, she discovered in a Brandeis lab a simple, quick method to pinpoint the specific amino acids that make up a protein’s active site.
Ondrechen had been studying predicted titration curves of single amino acids in vitamin-B6-dependent enzymes and noticed something really strange.
“What normally happens when you titrate something is that as you slowly raise the pH by adding base all of a sudden there is a precipitous drop in charge, just a very fast fall-off in charge,” she says. In other words, if the initial net charge is +1 it will suddenly drop to zero. Or if it starts at zero charge it quickly drops to −1.
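The steep fall-off Ondrechen describes is what the textbook Henderson-Hasselbalch relation predicts for a single, isolated titrating group. The short Python sketch below illustrates that ideal case only; it is not the Thematics calculation, and the function name `mean_charge` and the pKa of 7.0 are assumptions chosen for illustration.

```python
def mean_charge(ph, pka, protonated_charge):
    """Average charge of one isolated, ideally titrating group at a given pH.

    protonated_charge is +1 for a basic group (+1 -> 0)
    and 0 for an acidic group (0 -> -1).
    """
    # Henderson-Hasselbalch: fraction of the group still holding its proton.
    fraction_protonated = 1.0 / (1.0 + 10.0 ** (ph - pka))
    # Losing the proton removes one unit of charge.
    return protonated_charge - (1.0 - fraction_protonated)


if __name__ == "__main__":
    pka = 7.0  # illustrative value, not taken from any real residue
    for ph in range(4, 11):
        basic = mean_charge(ph, pka, protonated_charge=1)
        acidic = mean_charge(ph, pka, protonated_charge=0)
        print(f"pH {ph:2d}  basic group {basic:+.3f}  acidic group {acidic:+.3f}")
```

In this ideal model nearly all of the change in charge happens within about one pH unit on either side of the pKa, which is why a curve that stays flat over many pH units stands out.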
For some reason, though, a small, single-digit percentage of the curves on Ondrechen’s computer were strikingly flat. When she showed the results to other researchers, many were quick to dismiss her observations as an artifact of the method: “They all told me I was crazy,” she says. But while talking it over with Brandeis collaborators Dagmar Ringe and Jim Clifton, Ondrechen immediately knew she had hit on something big. “I said, ‘My God, we’ve got a way to find the active site. If you give me the structure of the protein, I can tell you where the active site is.’”
To recognize the clues nature left, Ondrechen used her chemical intuition. For a protein to interact with its substrate there must be some acid-base chemistry going on, she reasoned. The flat-liner residues, then, must be the amino acids that remain in their active form over a wide pH range and are thus part of the active site.
Drug makers look to a protein’s active site for clues to its function and to design drugs that fit snugly within its folds. Other methods rely on comparisons with similar proteins of known functions or on the tedious process of screening hundreds of thousands of small molecules to home in on the active site.
Ondrechen’s computational method, called Thematics, for theoretical microscopic titration curves, requires no prior knowledge beyond the protein’s three-dimensional structure. “On a Compaq Alpha workstation, a smallish protein, where there are only a couple hundred residues to do, can take less than an hour of CPU time for the whole calculation,” Ondrechen says. “It just starts at the coordinates and puts out the titration curves.”
Since the publication of Thematics in Proceedings of the National Academy of Sciences in late October, her phone has been ringing off the hook. “I’ve gotten lots of inquiries from pharmaceutical companies saying, ‘Can I have your program?’” Ondrechen says.
But she hasn’t yet decided whether to go the commercial route. That depends on whether she gets the federal funding she is seeking for a large public database she envisions of active sites overlaid onto protein structures.
First, though, she must solve a major bottleneck: to find the telltale residues, lab members must manually scour hundreds of titration curves generated for each protein. “We just use the human eye,” she says.
Now back at Northeastern, Ondrechen is collaborating with Bob Futrelle and Ron Williams of the school’s computer science department to develop machine-learning algorithms to automatically pick the statistically significant curves and eliminate human intervention. “And then we can do proteome-wide screening,” she says.
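To give a rough sense of what replacing the human eye could involve, the Python sketch below flags curves whose transition is much wider than the ideal one. It is a simple threshold rule, not the machine-learning approach the Northeastern group is developing, and everything in it is an assumption made for illustration: the synthetic curves (a Hill-type broadening factor stands in for a flat curve), the residue labels, the 90-to-10 percent width measure, and the cutoff of twice the ideal width.

```python
import math


def protonated_fraction(ph, pka, hill=1.0):
    """Synthetic titration curve; hill < 1 broadens (flattens) the transition."""
    return 1.0 / (1.0 + 10.0 ** (hill * (ph - pka)))


def transition_width(ph_values, fractions, high=0.9, low=0.1):
    """pH span between the first sample at or below `high` and at or below `low`."""
    pairs = list(zip(ph_values, fractions))
    start = next((ph for ph, f in pairs if f <= high), ph_values[-1])
    end = next((ph for ph, f in pairs if f <= low), ph_values[-1])
    return end - start


# pH grid from 0 to 14 in steps of 0.1.
ph_grid = [i * 0.1 for i in range(141)]

# Hypothetical residues: (pKa, broadening factor). Only "residue C" is flat.
synthetic = {
    "residue A": (7.0, 1.0),
    "residue B": (4.0, 1.0),
    "residue C": (7.0, 0.3),
}

# An ideal curve goes from 90% to 10% protonated over 2*log10(9) pH units.
ideal_width = 2.0 * math.log10(9.0)
cutoff = 2.0 * ideal_width  # arbitrary illustrative threshold

flagged = []
for name, (pka, hill) in synthetic.items():
    curve = [protonated_fraction(ph, pka, hill) for ph in ph_grid]
    if transition_width(ph_grid, curve) > cutoff:
        flagged.append(name)

print(flagged)
```

A fixed cutoff like this is only a starting point; a statistical or learned criterion would be needed to decide which curves are significantly flat across a whole proteome.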