Seer of Structures


David Baker picked proteomics because he didn’t want to be bored — now his structure predictions are wowing experts


By Meredith W. Salisbury


Earlier this year, a small protein folded the wrong way. Sure, proteins make folding mistakes all the time. But this particular protein, produced by streptococcal bacteria, twisted and turned in exactly the opposite direction from the one it normally takes, and it did so 100 times faster.

This made David Baker a very happy man. He and his colleagues, who first studied the original protein and possible changes to it in silico and then manufactured a gene to express the altered protein, didn’t hop on the train to scientific stardom at their success. Instead, Baker contends that the experiment was simply “a very good test of one’s understanding of the basic principles.”

Basic, schmasic. Baker is recognized by his colleagues as one of the best and brightest proteomics pupils of the day. At 38, he’s a Howard Hughes Medical Institute investigator and an associate professor with his own lab at the University of Washington. He was instrumental in developing Rosetta, a structure prediction algorithm that’s won him kudos in the field. He’s also a co-founder of Prospect Genomics, later acquired by Structural GenomiX.

Baker’s been at the university for eight years, and his studies primarily involve protein structure prediction, protein folding, protein design, and protein-protein interaction. He readily admits that this variety stems from his knack for getting bored. “That’s why when you asked me what the focus of my research was I sort of mumbled,” he half-jokes, “because it changes all the time.” This has in part led to a highly interdisciplinary approach to problems: he uses biophysics and molecular biology as well as computational and combinatorial library methods in his experiments.

It’s also why his affiliation with HHMI has been a boon. The institute provides resources and endorses projects that some public-sector entities couldn’t afford. “Imagine if someone gave you all these resources and said, ‘Do the greatest things you can with them,’” he explains.

Babe Ruth

Critical Assessment of Protein Structure Prediction (CASP) isn’t supposed to be a competition, but many of the participants view it as one. Held every two years, CASP encourages innovative structure prediction throughout the community. The sequences of several proteins whose structures have been solved by NMR or crystallography but not yet published are distributed to teams internationally. Those teams have several months to determine what each structure ought to be, and their results are judged against the established structure.

It’s an excellent test for someone like Baker, who can’t pry himself away from the challenge of solving an unknown structure. “If you just did things that were obvious it wouldn’t be much fun at all,” he says.

That’s where Rosetta comes in. Begun in 1996, the algorithm breaks up the sequence of a protein with no known structure into small pieces and then compares those against bits of sequence of proteins with a known structure. Because folding in proteins is highly localized, it’s possible to predict folding — and then final structure — by viewing the molecule as lots of smaller molecules all folding according to their own procedure and monitoring how they would interact.
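For readers who think in code, the fragment idea described above can be sketched in a few lines of Python. This is a toy illustration only, not Rosetta's actual implementation: the function names, the simple identity score, and the one-letter local-structure labels (H, E, C) are all assumptions made for the example, and a real method would go on to assemble and score the candidate pieces in three dimensions.

```python
from typing import NamedTuple


class Fragment(NamedTuple):
    sequence: str   # short stretch of residues from a protein of known structure
    structure: str  # hypothetical per-residue label: H = helix, E = strand, C = coil


def windows(sequence, size):
    """Return (start, piece) for every overlapping window of the given size."""
    return [(i, sequence[i:i + size]) for i in range(len(sequence) - size + 1)]


def identity(a, b):
    """Count positions at which two equal-length pieces carry the same residue."""
    return sum(x == y for x, y in zip(a, b))


def build_library(known, size):
    """Cut each (sequence, structure) pair of a known protein into fragments."""
    library = []
    for seq, struct in known:
        for i, piece in windows(seq, size):
            library.append(Fragment(piece, struct[i:i + size]))
    return library


def best_fragments(target, library, size, top=1):
    """For each window of the target, list the most similar known fragments."""
    result = {}
    for i, piece in windows(target, size):
        ranked = sorted(library,
                        key=lambda f: identity(piece, f.sequence),
                        reverse=True)
        result[i] = ranked[:top]
    return result
```

Calling `best_fragments("DEFG", build_library([("ACDEFGHIK", "HHHHHCCEE")], 3), 3)` pairs each three-residue window of the made-up target with the closest piece of the made-up known protein, along with that piece's local-structure labels.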

At CASP3 in 1998, Baker’s team proved the legitimacy of Rosetta with more success than most of the techniques available. “It had become one of the better methods in cases where there isn’t already an example [of structure],” Baker says.

But that didn’t prepare anyone for what Rosetta did at CASP4. Evaluators of the structure submissions use a basic point system; Baker’s group wound up with 31 points, and the next highest team had eight. At the time, Peter Kollman from the University of California at San Francisco compared Baker’s success to Babe Ruth’s record-smashing home-run triumphs in 1927.

“Certainly the Rosetta program that he’s put together and his ability to do ab initio structure objectively is perhaps the best thing going in the field,” says Irwin Kuntz, who was a Prospect Genomics co-founder along with Baker.

Mr. Modest

Despite the accolades, Baker avoids walking around with a swelled head (though his wild halo of hair could easily be confused for it). Left to his own devices, he’d rather talk about his family — he has two sons, 9 and 7 — or his zeal for hiking and skiing or how great living in Seattle is.

Arthur Horwich, a Yale professor and HHMI investigator who has collaborated with Baker, recalls his demeanor after the success of CASP3. “He was acutely aware of where he had not gotten things correct, but he was really modest about the things he was right on.”

Baker is the first person to point out the flaws in Rosetta’s models, saying they’re “not very accurate, not high resolution.”

That’s typical, according to his postdoc advisor, David Agard of UCSF. “He doesn’t like to dwell on [successes] at all,” he says. “He just thinks of being excited by science as opposed to thinking of himself as a brilliant scientist.”

Baker’s appreciation for the sport of science harks back to his undergraduate days at Harvard, where he began taking biology courses during his final year. He focused on cell biology and got his PhD in biochemistry from the University of California at Berkeley in 1989. He wound up in Agard’s lab, where he was drawn to the multifaceted nature of proteomics. “It’s definitely a biology problem,” Baker says. “No naturally occurring polymers would fold up like this. Yet it’s much simpler than problems of cell biology because after all, it’s just a long molecule that should obey all the laws of chemistry.”

Kuntz, who met Baker at Agard’s lab, remembers him for this ability to reduce complex problems. At the time, Kuntz says he thought Baker’s approaches were naïve, but now he recognizes them as part of “David’s … stereotypically outside-the-box thinking and fresh perspective.”

Most Wanted

Having in some ways conquered CASP, Baker is spearheading a new project known as the 10 Most Wanted. At CASP4 last year, people hit on the idea that though the program’s competition had led to remarkable improvements in the field, it might be even more beneficial for them to work together.

So Baker and his peers are organizing a program to select 10 proteins (based on scientists’ arguments for how important they are) and attempt to figure out their structures. With the combination of all the experts’ methods, Baker hopes, the models will be more accurate and provide better functional information than even the best predicted models of CASP.

With this project just kicking off, there’s clearly still plenty of unknown turf for the ever-shifting Baker to explore in proteomics. No one expects him to switch gears out of the field anytime soon. But if there were suddenly a method that fulfilled his quest for structure prediction with significant accuracy — well, then, Agard says, “it wouldn’t be surprising at all if he found some new grand challenge.”

