A new consortium launched with the aim of understanding the structure and dynamic function of membrane proteins will likely result in an improved informatics toolbox for modeling proteins and other biomolecules.
Last month, the National Institute of General Medical Sciences launched the Membrane Protein Structural Dynamics Consortium, an interdisciplinary team of researchers that aims to use biophysical and computational methods to understand the structure, function, and dynamics of membrane proteins.
The consortium, funded by a five-year $22.5 million "glue grant" from NIGMS, comprises 30 scientists from 14 institutions in four countries and is hailed as one of the largest and most comprehensive projects focused on membrane proteins to date.
Led by researchers at the University of Chicago, the consortium also includes scientists from Cornell University, Columbia University, Germany's Johann Wolfgang Goethe-Universität, the National Institutes of Health, Stanford University, University of California-Los Angeles, University of Illinois, University of Pittsburgh, University of Toronto, University of Virginia, University of Wisconsin, the Netherlands' Utrecht University, and Vanderbilt University.
The project is expected to help scientists better understand a wide range of ailments caused by faulty membrane proteins, such as some forms of heart disease, diabetes, and neurological and hormonal disorders. Ultimately, since more than half of drugs currently on the market target membrane proteins, the project could pave the way for developing new or improved drugs and therapies.
The project will involve a combination of structural biology, magnetic resonance, fluorescence spectroscopy, biochemistry, and biophysics techniques, and will also have a very large computational component. Rather than assign the task of computational tool development to individual labs, the consortium has opted to delegate the task to a select group of researchers, dubbed the computational core, who have considerable expertise in that arena.
Benoit Roux, a professor of biochemistry and molecular biophysics at the University of Chicago and one of the members of the core, told BioInform that the group aims to develop tools centered around four specific aims: to generate different kinds of force fields for proteins; integrate data at different scales and resolutions; identify pathways between different conformations of proteins; and interpret spectroscopic measurements, which are often "ambiguous."
Along similar lines, Harel Weinstein, a professor of physiology and biophysics at Cornell University, said the group will develop tools for "large-scale simulations" of protein structures that realistically represent the protein and its fluid environment, as well as tools to let researchers look at protein dynamics based on known structures and to interpret "a new generation of data" from things like single molecule experiments. The team also hopes to improve on "sophisticated" tools that are currently used in computational biology labs.
"That is quite complicated and requires several levels of new thinking … in terms of the use of computational resources [such as] parallelization and new types of processors," he told BioInform. "We are trying to develop analysis tools that reduce the data to the same kinds of components [that] everybody [can] understand — even those who are not using the [same] experimental tool[s] for data acquisition."
Roux said that the developers plan to use existing tools that they have developed in their labs, which they will "extend" to work in a more "general fashion."
"It’s a challenge because [the proteins are] a bit more disparate … you have a channel, a pump, you have maybe a receptor … being analyzed," he said. "The tool has to be general enough that it's suitable for all these problems but not so general that it becomes a bit abstract."
An additional challenge for the core, Weinstein said, will be to develop tools that can be easily used by experimental biologists involved in the consortium who don't have intensive bioinformatics training. On that score, he said the consortium is considering conducting workshops that will provide training on how to use the tools.
The computational core also plans to develop a centralized database to store and integrate project data in a format that consortium members can make sense of no matter what tool was used to collect the experimental data, Weinstein said.
The planned database will be different from repositories like the Protein Structure Initiative's Structural Biology Knowledgebase in the sense that "the same people who create the tools for dissemination and disseminate the data are the people who collect the data," he said.
"I could put my data in [PSI]; it’s a very useful tool," Weinstein acknowledged. "What is different is [the consortium's database] is created for people who work together and who refine the database all the time as they work."
The PSI knowledgebase is a product of the Protein Structure Initiative launched by NIGMS in 1999 with the long-term goal of making it easy to obtain three-dimensional atomic-level structures of proteins by looking at their DNA sequences.
Filling the Gap
Computational core member Ivet Bahar, a professor of computational biology at the University of Pittsburgh, told BioInform that her team plans to use "elastic network models" to analyze membrane proteins without compromising the resolution.
She said that although her models, developed over a decade ago, have been used to study other molecules, reworking these tools to work for membrane proteins is a greater challenge because they have to take into account the membrane as well as how the protein interacts with the membrane, among other things.
Membrane proteins, along with the lipid and water molecules that surround and interact with them, create large complex systems that are difficult to model, explained Bahar.
"My lab simplifies the analysis of such complex systems by using simplified models … [that] help us understand the mechanisms [and] the machinery of [the] biological processes," she said.
According to Bahar, the models her team develops "approximate" proteins and other biomolecules as "elastic materials" such as beads and elastic springs, which provides insights into which structures are the most "energetically favorable" for different functions.
"The logic is [that] each [protein's] architecture has been designed to favor particular molecular movements … some types of structural changes will be more probable than others," she explained. "The more probable structural changes are actually what [are] needed for [the proteins] to achieve their biological function."
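The bead-and-spring picture Bahar describes corresponds to what the literature calls an elastic network model. A minimal sketch of one common variant, the anisotropic network model, is below; the helix coordinates, distance cutoff, and uniform spring constant are illustrative assumptions, not the consortium's actual parameters. The low-frequency modes of the network's Hessian are the "more probable structural changes" referred to above.

```python
import numpy as np

def anm_modes(coords, cutoff=12.0, gamma=1.0):
    """Build an anisotropic network model (ANM) Hessian from bead
    coordinates (e.g., C-alpha positions) and diagonalize it.
    Beads within `cutoff` are joined by identical springs of stiffness
    `gamma`; the smallest nonzero eigenvalues correspond to the most
    energetically favorable collective motions."""
    n = len(coords)
    hess = np.zeros((3 * n, 3 * n))
    for i in range(n):
        for j in range(i + 1, n):
            d = coords[j] - coords[i]
            dist2 = d @ d
            if dist2 > cutoff ** 2:
                continue
            block = -gamma * np.outer(d, d) / dist2
            hess[3*i:3*i+3, 3*j:3*j+3] = block
            hess[3*j:3*j+3, 3*i:3*i+3] = block
            hess[3*i:3*i+3, 3*i:3*i+3] -= block
            hess[3*j:3*j+3, 3*j:3*j+3] -= block
    evals, evecs = np.linalg.eigh(hess)  # eigenvalues in ascending order
    return evals, evecs

# Toy example: 20 beads along a helix (a non-degenerate 3D shape).
t = np.arange(20) * 0.6
coords = np.column_stack([4 * np.cos(t), 4 * np.sin(t), 1.5 * t])
evals, evecs = anm_modes(coords)
# The six smallest eigenvalues are ~0 (rigid-body translation/rotation);
# mode 7 onward describes internal, collective deformations.
```

In practice, the columns of `evecs` for the smallest nonzero eigenvalues give the directions in which each bead tends to move, which is how such models connect a static architecture to its functional motions.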
However, Bahar noted that while in silico molecular simulations can capture processes that occur in nanoseconds, they cannot reach protein activities that last hundreds of microseconds or milliseconds without sacrificing resolution.
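The scale of that gap is easy to put in numbers. A back-of-envelope sketch, assuming a typical all-atom integration timestep of about 2 femtoseconds (the exact value varies by method and is an assumption here, not a figure from the consortium):

```python
# Why all-atom molecular dynamics struggles to reach long timescales:
# each integration step advances the system only ~2 femtoseconds.
TIMESTEP_FS = 2.0

def steps_needed(duration_s):
    """Number of integration steps to cover a simulated duration (seconds)."""
    return duration_s / (TIMESTEP_FS * 1e-15)

for label, duration in [("1 nanosecond", 1e-9),
                        ("100 microseconds", 1e-4),
                        ("1 millisecond", 1e-3)]:
    print(f"{label}: {steps_needed(duration):.0e} steps")
```

Reaching a millisecond thus takes on the order of 10^11 timesteps, each requiring a full force evaluation over every atom, which is why lower-resolution models are needed to bridge to the longer timescales of biological processes.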
"We are trying to fill the gap. We are building lower-resolution models in order to be able to reach longer time processes, simulate them, and be able to relate them to experimental data," she said, adding that her team will work with other core members to develop "hybrid models" that adopt a "multiple scale approach" to analyzing membrane proteins.
A Marriage Made in Heaven
Klaus Schulten, a professor at the University of Illinois at Urbana-Champaign, told BioInform that using computers to describe the proteins and the surrounding membrane is like a "marriage made in heaven" because computers can deal with the "technical challenge" of describing the "volumes" of data that are generated from protein studies.
He explained that studying membrane proteins and their surrounding environment can involve describing 100,000 atoms for smaller proteins and up to one million atoms in the case of larger proteins.
"Computers are very good in dealing with membranes because the membrane is not a very well-structured entity," he said. "There is no other way really to get a snapshot of something that is so disordered and fluid."
Schulten's lab has developed NAMD, a molecular dynamics code designed to simulate large biomolecular systems, and VMD, a molecular visualization program used to display, animate, and analyze large biomolecular systems in 3D. Schulten said that both programs, which currently have more than 160,000 users worldwide, have been used to describe membrane processes.
For this project, the team will focus on making its computer simulations predict protein activity more accurately, which Schulten said will require simulating up to a hundred million atoms.
He noted that getting the tools to handle larger quantities of data should be "pretty easy" because computers have consistently gotten larger; however, getting them to run faster is a "very difficult problem."
"We really need to think … [about] how to find better tricks and much better algorithms to speed up the computation," he said, noting that his lab has "made inroads" in that space by speeding up its programs in the last two years by a factor of 100. He said that he plans to increase the speed by at least the same amount for this project.
For its part, Weinstein's lab will focus on developing tools that can be used to study how the proteins' activities are "propagated" in the membrane, including interactions with other cell components involved in signaling, for example, using "a variety of multi-scale approaches."
Weinstein said that this would involve making the tools "more rigorous" to better represent the protein systems as well as covering time segments "at the level of the biological processes."
A third step involves incorporating the different components that interact with the protein into the models, Weinstein said.
He explained that in the cell membrane, proteins are often attached to "sugar-like" chemical components, and noted that there currently aren’t any "good tools" that can compute the dynamics of a cell membrane, a protein, and the "sugar decoration" at the same time.
'We Want to be Ambitious'
Roux noted that the computational core wants to be "ambitious" when it comes to tool development for the project, but he added that increased ambition means increased challenges.
"It's better to have a moderate success than to have a total failure on something that’s overly ambitious," he said. "We will want to modify these [tools] to make [them] as exciting as possible but we need to make sure that at the end, something comes out that is useful."
Weinstein said he expects that there will be a lot of overlap in the tools' functions since each computational core member has worked on membrane proteins for several years. However, he said each research group has a "particular strength" that they contribute to the team and added that "the eyes are on us" to make progress that is "qualitatively and quantitatively bigger than individual projects."
"My expectation of myself and of my lab and of my colleagues is that we will take what we have and really turbo charge it," he said. "That is, make it more accurate [and] efficient so that [we] can compute larger parts of the very large and complex biological systems that we study."
He said that although the project was funded for five years, continued funding is contingent on the project's performance, and the consortium plans to "do our work well" in order to receive additional funds for another five years.
Roux said that the team plans to have the tools refined and available for widespread use outside the consortium within a few years, adding that the groups have no plans to market any of the tools developed for the study.