Microsoft Research has included a bioinformatics project among several “key areas” it will support in 2008 under an award program sponsored by its External Research & Programs group.
The company this week announced six requests for proposals totaling $5 million that it plans to issue under the program, which will pay academic research groups to conduct projects that “complement the company’s overall research agenda and goals,” according to a statement.
Among the areas slated for funding in 2008 is a program to develop improved data-analysis tools for genome-wide association studies. Microsoft Research intends to award a total of $700,000 to several recipients under this program, Kristin Tolle, a program manager in the External Research & Programs group who is overseeing the bioinformatics RFP, told BioInform this week.
Tolle said that the individual awards will range from $80,000 to $150,000.
Microsoft plans to issue the RFPs through its External Research & Programs homepage in late August. Proposals will be due by the end of January. Recipients will be announced in March after peer review by a panel of internal and external experts.
‘Pushing the Envelope’
Tolle said that the program is designed to fund higher-risk projects that agencies like the National Institutes of Health or the National Science Foundation would typically avoid.
“I would encourage [researchers responding to the RFP] to really think about breakthrough ways” of approaching the challenge of data analysis for genome-wide association studies, she said. “Don’t think about reinventing the wheel. The hope here is that we really do end up pushing the envelope.”
Sailesh Chutani, director of External Research & Programs at Microsoft Research, said that while the program’s funding model is “very similar” to that practiced by NIH or NSF, “the idea is that these funds are used to do some of the proof-of-concept work, to take some of the technical risks out. And then once they do that, we expect that the people we fund will go to the NIH or the NSF and raise more money because they will have taken some of the risk out, [and] they will have proved some of the initial hypothesis and served up a proof of concept.”
At least one bioinformatics researcher agrees that the Microsoft program supports work that otherwise may not have been funded. Joel Bader, an investigator at Johns Hopkins University and the recipient of an award under last year’s “Computational Challenges in Synthetic Biology” program, told BioInform that the funding “is really letting us move forward with a project that we were having difficulty getting funded through conventional mechanisms.”
The project, called BioStudio, is an editing and revision-control environment for designing synthetic genomes — specifically, reengineered versions of yeast that will exhibit properties that are more useful for experimental research.
“It’s kind of a futuristic, out there project at this point,” Bader said. “Since it’s so new, it’s hard to convince people that it’s going to be important to make synthetic species, so then if it’s not so important to make synthetic species, why is it so important to have a software infrastructure that lets a team of people redesign a genome?”
Bader said that his team was funding the project internally prior to applying for the Microsoft award, which “really helps us move everything along.” In addition, he said, the group has recently received “favorable funding news” for the project from other funding bodies — a trend he attributed to the “jumpstart” from the Microsoft program.
Microsoft’s Chutani said that the goal of the External Research & Programs group is to foster “interesting problems that would have an impact, potentially, on our business or on our research interests five years out or so.” In particular, he said, the program looks for projects that can exploit Microsoft’s technical assets or know-how, including “products, technologies, or some of the research tools that we may have developed for our own use.”
The group has a number of mechanisms for funding academic research, and the RFPs announced this week are reserved for high-risk projects that involve “a broader exploration in an area that is new to us,” Chutani said. The company also offers larger awards with a “deeper engagement” for individual investigators, as well as awards in the million-dollar range for “joint institutes” between Microsoft Research and university labs.
Traditionally, the External Research & Programs group has been focused on computer science, “but more and more it’s also looking at how computer science can help solve problems in the sciences, and especially in healthcare,” Chutani said.
He estimated that about a quarter of the group’s external engagements are currently in the healthcare realm and noted that the group is “ramping up quite significantly” in this area.
Within healthcare, the External Research & Programs group is focusing on three areas, he said. “We’re looking at how you can use information and communications technology to improve access to healthcare, how you can use ICT to improve the quality of what’s delivered, and how you can use ICT to speed up the breakthroughs.”
‘Keen Interest’ in GWAS
The genome-wide association study RFP falls into the last of those three categories, Tolle said.
She said that Microsoft Research has identified three specific areas within the GWAS realm that it would like proposals to address.
The first area is improved tools for accessing, analyzing, and visualizing data in the HapMap repository, which is currently hosted at Cold Spring Harbor Laboratory. Tolle said that Microsoft Research isn’t collaborating directly with that effort, but is working on another research project at CSHL.
“What we’re hoping is that the people who apply for the RFP will be collaborators with them as well, but it’s not going to be a requirement,” she said. “It certainly would strengthen a proposal for acceptance if they were collaborating with Cold Spring Harbor.”
The second focus area under the GWAS RFP is data standardization for the collection of genetic information. “We’d like to see if we can improve that through semantics and database schema,” she said.
This initiative is of some interest to an “incubation team” at Microsoft called the Health Solutions group. “That organization does a lot of data management, so of course they’re going to care a lot about the standardization of genetic capture of information,” Tolle said.
Finally, Microsoft Research hopes to use the RFP program to improve the accuracy and performance of algorithms for analyzing GWAS data. Microsoft’s high-performance computing team has a stake in this initiative, she said. “A lot of the algorithms are fantastic, but the biggest problem that [researchers] face is the ability for them to perform in a reasonable amount of time, and if you could parallelize them, perhaps you would be able to do more,” she said.
More broadly, Tolle said, at least one Microsoft Research employee has a “keen interest” in genome-wide association studies. David Heckerman, manager of the Machine Learning and Applied Statistics Group at Microsoft Research, is working on the rational design of vaccines for malaria and HIV and has already released several software packages that he developed as part of his work in this area [BioInform 06-15-07].
Heckerman is “extremely interested in moving into genome-wide association studies,” Tolle said. “So, because we have an outlet within Microsoft Research, a researcher who can pick up that technology and move forward, that also makes it of keen interest.”
An Open Source ‘Microsoft Office’ for Bioinformatics?
Chutani said that Microsoft Research has several technologies in hand, such as SQL Server, data-mining tools for Excel, and its F# programming language for the .NET platform, that could serve as a foundation for an improved GWAS-analysis platform in the short term.
In the longer term, he said, the company intends to use these tools and others that are still under development as the building blocks for “a toolkit for bioinformatics.”
Tolle said that Microsoft Research is currently working with several undisclosed researchers “to find out what tools are out there and where the gaps are … and once we have completed that analysis we’re going to fund people specifically to fill in those gaps.”
The goal, she said, is to create a “complete set” of tools for bioinformatics researchers. “If there’s one thing that we’d love to provide, it’s … a good end-to-end [system] that covers at least the … 80 percent of tools that a bioinformatics researcher would need to do their job well.”
It is still too early to know whether this planned toolkit would be sold as a pre-packaged “Microsoft Office” of sorts for bioinformatics, Tolle said. Initially, she said, the tools will be released via CodePlex, Microsoft's open source project hosting web site.
Microsoft recognizes that “researchers are not like other users,” Chutani said. “They tend to lift the hood, they make modifications, they’re looking for extensibility.”
He added that the company plans “to make sure that for most of these tools we make the source code available so that people can share that. Most of these are distributed under BSD-style licenses so people can share and add to it.”
Chutani and Tolle were unable to provide a timeline for the development of the bioinformatics toolkit.