Playing at Protein Prediction


Any large distributed computing network that taps idle desktop computers can harness computing power rivaling that of any supercomputer. Protein research, where simulating molecules and predicting their structures demands enormous numbers of processor cycles, is well suited to this model. The life sciences community made its first big splash in distributed computing with the launch of Stanford University's Folding@home in October 2000. Users download the app, which runs as a screen saver displaying real-time molecular dynamics simulations and feeds results back to the project's home at Stanford. According to Folding@home's website, roughly 3 million people have downloaded the program, and more than 300,000 users are currently running it.

Similarly, Rosetta@home is a distributed computing project that mirrors almost exactly the design of SETI@home, a distributed software program used to analyze radio signals from space. Rosetta@home is geared toward predicting three-dimensional protein structures as well as designing new proteins that do not occur in nature. Since its release in 2005, Rosetta@home has delivered significant advances in protein structure prediction, with a particular emphasis on designing new enzymes, says David Baker, a professor of biochemistry at the University of Washington in Seattle. The program has contributed to Baker's research on a range of diseases, including malaria, anthrax, HIV, Alzheimer's disease, and cancer.

While Rosetta@home has proven itself to be an effective tool for Baker and other researchers, the individuals kind enough to install the program on their desktops began reporting some curious observations. "People would write in and say, 'Hey, I've been watching the screen saver for a while and it's not always doing what I think is the right thing,'" Baker says. After an inspiring hike up Mount Rainier, Baker and a colleague hit upon an interesting idea: instead of concentrating on harnessing unused computer cycles, why not harness unused brainpower? If users were already taking the time to stare at their Rosetta@home screen saver, watching it spend time on calculations that looked clearly pointless, it might be more effective to put them in the driver's seat.

Fun with Foldit

The solution is an interactive version of Rosetta@home called Foldit. Released in May, the 3D protein folding game has already garnered a great response from fans of Rosetta@home, with more than 40,000 downloads at the time this article went to press. Admittedly, the game is best understood by playing it, but the basic idea is that players twist and turn proteins to reach the lowest-energy structure, which in turn earns the highest score. At first, players are guided through introductory levels, with a cartoon version of Baker providing instructions and helpful hints along the way. Once players hone their folding skills, they are set to work on competitive puzzles that are much more challenging.

The solutions to these folding puzzles are relayed back to the Foldit server, and then evaluated for their research potential. The best solutions go on to be tested to determine their catalytic activities. Baker says the competitive nature of the game, which has users from around the world, helps to ensure a never-ending stream of protein folding solutions. "There is a world leaders list so you can see where you rank compared to everybody else, and when you make improvements your rank goes up," says Baker. "Distributed thinking should be pretty powerful, so this is an experiment to find out to what extent networks of people working on competitive games can solve hard scientific problems."
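The link between energy, score, and rank can be pictured with a toy example. The sketch below is only an illustration under stated assumptions: the player names, the energy values, and the rule that score is simply the negative of the energy are all made up here, and none of it is Foldit's actual scoring function.

```python
# Toy illustration of "lowest energy wins": rank hypothetical players by a
# score that rises as the energy of their submitted structure falls.
# Energies and the score formula are invented for illustration only.

def score_from_energy(energy):
    """Map an energy to a score so that lower energy gives a higher score."""
    return -energy

def leaderboard(submissions):
    """Return (rank, player, score) tuples, best score first."""
    scored = [(player, score_from_energy(e)) for player, e in submissions.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return [(rank, player, score)
            for rank, (player, score) in enumerate(scored, start=1)]

# Hypothetical submissions: player name -> energy of the folded structure.
submissions = {"player_a": -120.5, "player_b": -98.0, "player_c": -134.2}

for rank, player, score in leaderboard(submissions):
    print(rank, player, score)
```

Here the hypothetical player_c, with the lowest energy, lands at the top of the list, which is the behavior the game's ranking is described as rewarding.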

In addition to providing a boost to his and others' protein research, Baker says the educational benefits of Foldit are obvious. "I think it could really transform science education because you learn so much about proteins, macromolecules, and biochemistry, which is very hard to convey in a boring high school science class," he says. "The dream is that there could be this 12-year-old kid in Indonesia who can see things in his head, like some kind of protein savant, and end up designing an inhibitor for the flu virus."

Although commercial vendors do offer drug discovery tools that handle some aspects of molecular dynamics and small-molecule docking, none of them really comes close to what Foldit and Rosetta@home offer, Baker says. Right now, Foldit runs on Windows and Mac OS X, but has not yet been ported to Linux or to gaming consoles like the PS3 or Xbox. "One problem is that Rosetta is a big, complicated program, so the challenge of getting it ported to the PS3 or Xbox would be larger than for molecular dynamics code," he says. "But if Sony came to us and said, 'We'd like to get your software running on PS3 or Xbox,' we'd be extremely happy."

The GPU advantage

Hardware manufacturers are helping protein simulation research in other ways. The Folding@home project has recently been given a tremendous boost with the porting of its code to Nvidia's chips via the company's CUDA GPU programming platform (check out GT's coverage of Nvidia in the September '07 issue). Folding@home, which gained a lot of attention when it was ported to run on Sony PS3s, was initially intended to run only on CPUs. The team did offer a primitive GPU version early on, but programming for GPUs before tools like CUDA was a brutal, error-prone task. Last year, the Folding@home team successfully ported the code to run on ATI's graphics cards, but with Nvidia's chips now in the mix, the sky is the limit in terms of potential computing power.

Vijay Pande, an associate professor at Stanford University who leads the Folding@home effort, says that the addition of the Nvidia chips will provide computing muscle to rival that of any supercomputer. "There are about 70 million Nvidia GPUs worldwide that could run Folding@home, and the average chip can provide 100 gigaflops of computing power," Pande says. "If we were to just get 0.1 percent of those 70 million, that's 70,000, and each of those 70,000 is producing 0.1 teraflops so that's something like 7 petaflops performance." While supercomputer enthusiasts are busy squawking about achieving one or two petaflops' worth of performance, Folding@home will come close to breaking the 100-petaflop barrier in the near future, he says.
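Taking the inputs in the quote at face value (70 million GPUs, 0.1 percent participation, 100 gigaflops per chip), the aggregate can be checked with a few lines of arithmetic. The sketch below is a back-of-the-envelope illustration only: the function and variable names are ours, and it assumes every participating chip sustains the quoted average, which real-world clients may not.

```python
# Back-of-the-envelope check of the aggregate throughput estimate.
# All inputs are the figures quoted in the article; nothing is measured.

GPUS_WORLDWIDE = 70_000_000   # Nvidia GPUs said to be able to run the client
PARTICIPATION_PER_1000 = 1    # 0.1 percent, expressed as 1 GPU in every 1,000
GFLOPS_PER_GPU = 100          # quoted average per chip (0.1 teraflops)

GFLOPS_PER_PETAFLOP = 1_000_000

def aggregate_petaflops(gpus, per_1000, gflops_per_gpu):
    """Return (active GPU count, total throughput in petaflops)."""
    active = gpus * per_1000 // 1000
    total_gflops = active * gflops_per_gpu
    return active, total_gflops / GFLOPS_PER_PETAFLOP

active, pflops = aggregate_petaflops(
    GPUS_WORLDWIDE, PARTICIPATION_PER_1000, GFLOPS_PER_GPU
)
print(f"{active:,} GPUs -> {pflops:g} petaflops")  # 70,000 GPUs -> 7 petaflops
```

Because the estimate is a simple product, doubling either the participation rate or the per-chip throughput doubles the total.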

And for those of you whose heads start swimming when talk of flops abounds, Pande believes that at the end of the day, what really matters is how these distributed solutions can alleviate disease. To this end, Pande's group is putting Folding@home's simulation power to use on real-world problems, such as discovering molecules that might help inhibit the toxicity associated with Alzheimer's disease. "We talk a lot about technology, but it's not very interesting unless it does something," he says. "Some of these applications from the technology will start coming out in the next year, and I think that's going to be the exciting part."
