BOSTON — After eyeing the life sciences from a relatively safe distance for several years, Microsoft this week signaled its commitment to the sector with the launch of the BioIT Alliance, a network of industry partners enlisted to ensure that its ubiquitous desktop tools meet the demands of biomedical research.
Microsoft announced the alliance here this week at the Bio-IT World Life Sciences Conference and Expo, where the majority of attendees welcomed the software giant's pledge to address the sector's challenging IT requirements.
"We are very pleased that Microsoft is working to help make their tools more applicable to the field of life science research," said Steve Lincoln, vice president of informatics at Affymetrix, a founding member of the alliance.
Even organizations that are not part of the initiative described Microsoft's effort in positive terms. "I think it's a good move," said Matt Shanahan, chief marketing officer for Teranode. Currently, Shanahan said, Microsoft's tools aren't flexible enough to enable researchers to freely exchange data with other systems, but the company's effort promises to improve interoperability and collaboration, which is "a step in the right direction," he said.
The company has enlisted a number of industry heavy-hitters as founding members of the alliance, including Affy, Accelrys, Applied Biosystems, Amylin Pharmaceuticals, and the Scripps Research Institute. Other founding members include the BioTeam, Digipede Technologies, Discovery Biosciences, Geospiza, HP, Sun Microsystems, and VizX Labs.
Don Rule, platform strategy advisor at Microsoft, told BioInform that the primary goal of the alliance is to ensure that upcoming versions of its productivity tools — including Office 2007 — can handle the complex challenges of biological research.
Microsoft has identified "a transformation going on" in the life sciences, where there is a "strong need for software that will speed the process of discovery," he said. Because many scientists already use the company's software, he said, "We recognize that we have an important role to play" in the market.
It's not all about the science, however. The sector's well-known challenges of data integration, data management, and collaborative research serve as an effective "proof point" for features that would be applicable across the company's suite of tools, Rule said. Biology is to vertical markets what Broadway is to show business, he said: "If you can make it there, you can make it anywhere."
Rule said that the alliance will eventually encompass a number of working groups that will deliver "proof of concept" that Microsoft's tools can overcome specific hurdles in life science informatics.
For its first project under the initiative, Microsoft is collaborating with Scripps and InterKnowlogy — a San Diego-based professional services firm that specializes in Microsoft technology — to develop the so-called Collaborative Molecular Environment, or C-ME. This system, based on Microsoft Office, Windows Presentation Foundation, and SharePoint, is expected to help streamline data capture, visualization, annotation, and archiving.
Rule said that the Scripps project was somewhat "serendipitous" in that it grew out of an existing partnership between Peter Kuhn's lab at Scripps and InterKnowlogy. This collaboration began at around the same time Rule was recruiting partners for the BioIT Alliance, so it made sense to bring the project into the fold, he said.
C-ME is being designed to "help capture more data in the lab more readily" and to help researchers easily associate that data with other entities, Rule said. Currently, he noted, most labs are only capturing around 75 percent of their data electronically, with the remainder hidden away in lab notebooks. By making it easier for researchers to annotate biological entities through familiar tools like Word and Excel, and share those annotations via tools like SharePoint, Rule said that Microsoft plans to help labs regain that missing 25 percent.
Rule said that a beta version of C-ME is expected to be released in May, and the full capabilities of the system should be available in Office 2007, currently targeted for release at the end of the year.
While the BioIT Alliance has yet to specify any future projects, Rule said there's a good chance that the next one will be focused on integrating genomic biomarkers with clinical data.
XML and Excel
Office 2007 includes a number of features expected to interest the life science community. For example, it will use XML — the format of choice for life science data exchange — as its default file format.
"We've had a lot of experience with XML and we've developed a lot of tools that use XML as the data format, so we were very gratified to see that Microsoft was going that route as well," said Tony Kerlavage, senior director of global service development and support at ABI.
"We thought that this was a great way to … facilitate getting data off of our instruments and into tools like Excel — or into anybody's tools, because if you use this open format, it doesn't tie us just with Microsoft, but it makes it very open so that people can develop their own tools," Kerlavage said.
Microsoft's version of XML is called Office Open XML, and the company has taken a number of steps to convince the IT community that it has earned the "open" moniker. Microsoft submitted the file format to Ecma International, a European standards body, for open standardization, and has also launched the Open XML Formats Developer Group, an online forum to support developers working with the format (http://openxmldeveloper.org/). The company has also posted a "covenant" on its website (http://www.microsoft.com/office/xml/covenant.mspx) that allows third parties to use the Office Open XML schema without having to pay royalties.
Microsoft's adoption of Open XML is in line with a strategy that ABI has adopted over the past year, Kerlavage said, "to make our software tools more open, and to make our data more accessible to our customers rather than being in a proprietary format."
Affy's Lincoln said that Open XML will be "helpful for the scientific community to directly pass objects from specialized scientific applications into the most commonly used desktop productivity tools." Currently, such transfers often require more cumbersome data export and import steps, he noted.
The BioIT Alliance should also address some limitations in Excel, which many consider to be the most common software tool in bioinformatics. "People will find whatever way they can to take their data out of whatever software format is coming out of any vendor's instrument and get it into Excel and manipulate it," Kerlavage said.
ABI had already begun building prototypes of new Excel applications, such as a tool for visualizing sequence data directly from the sequencer within an Excel spreadsheet, before it entered into discussions with Microsoft, Kerlavage said.
Lincoln said that even though Excel can't handle the amount of data coming off of Affy's GeneChips, many customers like to use the application to manipulate subsets of data and computed results from microarrays. The Office 2007 version of Excel is expected to handle larger data sets than the current version, although Lincoln noted that this would still not be as large as scientists typically process using general-purpose desktop data analysis tools, such as SAS's JMP, Spotfire's DecisionSite, Insightful's S-Plus, Partek, or R.
Microsoft Partnerships Extend Beyond Alliance
Microsoft has other, less formal relationships with the bioinformatics community that are not associated with the BioIT Alliance. Jim Ostell, chief of the information engineering branch at the National Center for Biotechnology Information, said in a talk at the Bio-IT World Life Sciences Conference and Expo that NCBI is working with the software giant to take advantage of the XML capabilities that will be available in its Office tools. The work is aimed specifically at PubMed, where documents authored in Word could be automatically converted to XML.
Also, Michael Reich, manager of cancer informatics development at the Broad Institute of Harvard and MIT, said that his team has been working with Microsoft for around four months so that results from the Broad's GenePattern analysis software package can be automatically generated as Word documents.
Scientists already use Excel and other Office applications "habitually," Lincoln said. "Thus, enabling a workflow that can allow at least certain Affymetrix data to seamlessly fit into these tools is clearly worthwhile. This will enhance the usability of GeneChips and make our customers even more productive."
Neither Lincoln nor Kerlavage identified specific outcomes that they expect from the BioIT Alliance, however.
"We're all talking to Microsoft, that's a great first step, and we'll see what happens from there," said Lincoln.
Likewise, Kerlavage said that ABI has not set a "concrete timeline" for its work with Microsoft. "We're doing exploratory work now," he said, noting that the prototypes that the company has developed so far are "consistent with our philosophy of working to build small tools that are easy to use rather than building large applications."
For software vendors in the alliance, Rule stressed that the initiative will pursue projects that offer "opportunities for commercialization." Microsoft sells — and will continue to sell — "big horizontal products," he said, so any specific features targeted to the life science market that arise from the projects in the BioIT Alliance would be available for software partners to package and sell.
"Much more of the biomedical functionality will be fulfilled by the ISVs than by ourselves," Rule said.
— Bernadette Toner