With the goal of expanding the distribution of the open-source Bio-Formats software for image file format conversion, Glencoe Software, an imaging software company established by founders of the Open Microscopy Environment project, said last week that it will be providing commercial licenses for software developers seeking to work with the format library.
Glencoe and the Laboratory for Optical and Computational Instrumentation at the University of Wisconsin at Madison, another OME member and the developer of Bio-Formats, said that the deal will keep the software open source while allowing commercial organizations to redistribute it.
“This is a mechanism to deliver Bio-Formats to developers or software companies who don’t want to use the GPL license,” Jason Swedlow, president and co-founder of Glencoe Software, told BioInform. Swedlow is also a professor and Wellcome Trust Senior Research Fellow at the University of Dundee, and a founder of the OME project.
Glencoe, established in 2005 by Swedlow and another OME co-founder, Peter Sorger of Harvard Medical School, serves as the licensing and customization arm for OME’s OMERO image-management and -analysis platform. The company’s Data Management Engine, or DME, is a version of OMERO for customers who want to use OMERO for commercially licensed applications, or who need the software customized.
The company’s customers include Applied Precision, Rockefeller University Press, and PerkinElmer.
Redistributing the Code
Swedlow said that distribution was the main impetus for the licensing deal, which is expected to appeal to developers who are creating novel applications with Bio-Formats and may want to distribute their work as closed-source applications.
The commercial license is not necessary for pharmaceutical or biotech clients using the software for their own research, but is expected to interest developers who want to build on top of the library without releasing their derivative software under the GPL.
“If you want to deliver a closed source application as part of a contract with a customer and you want to use Bio-Formats, this allows you to purchase a commercial license for a closed version of Bio-Formats to sell on to your customer,” he said. The license also gives his firm a way to provide a tested, supported version of Bio-Formats with a warranty, said Swedlow.
Typical licensees may be imaging-software companies who need Bio-Formats to read a variety of file formats. The fact that Bio-Formats is “an actively developed and supported project” should benefit these customers, Swedlow said. “We’re doing the work of chasing the updates to the file formats. As a developer you don’t have to do that work.”
While the University of Wisconsin licensing office could have licensed the open-source tool to commercial parties, Swedlow believes Glencoe “adds value” beyond a typical license. “It’s mostly experience, know-how, [the ability to] support,” he said. “The know-how in terms of delivering and, if needed, customizing [Bio-Formats] exists with us.”
Glencoe is “an ideal partner” for creating image informatics solutions, Kevin Eliceiri, LOCI’s director, told BioInform. “Companies want to buy support, the help to integrate it,” he said. Buying the license from Glencoe means more than just acquiring the right to use Bio-Formats in a closed-source software tool, he said.
Building a Library
Bio-Formats is a Java library developed at LOCI that lets scientists read and write approximately 60 proprietary image file formats.
LOCI, Glencoe, and Swedlow’s group at Dundee belong to the OME consortium, formed in 2001 by a group of laboratories around the world dedicated to opening the lines of communication in light microscopy by, for example, promoting universal file formats.
OME developed OME-TIFF, which builds on a data model called OME-XML. This format was intended to make it easier for researchers to migrate images and metadata.
Eliceiri and programmer Curtis Rueden initiated the Bio-Formats project and approached OME to collaborate. “We immediately agreed, because what they were doing was a critical task in the field of imaging,” said Swedlow.
When the collaboration began, Swedlow said, he saw Bio-Formats as one of the “critical parts of OMERO.” “One of our developers curates the OME data model, which is essentially an expression of all the different metadata relationships in a microscope image,” he said. “That is the foundation for everything the OME Consortium does.”
Swedlow said his group at Dundee has been putting in development time to integrate Bio-Formats and OMERO, and Glencoe has just added a full-time software developer who solely works on Bio-Formats.
“Bio-Formats takes [the OME] data model and uses it to essentially populate data elements reading from proprietary file formats, and then OMERO takes Bio-Formats and uses that as an interface to pull in data from proprietary file formats,” Swedlow said.
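The layering Swedlow describes, in which format-specific readers fill in one shared data model and a downstream application talks only to that shared interface, can be illustrated with a minimal Java sketch. This is not the Bio-Formats or OMERO API. Every name here (`CommonImage`, `FormatReader`, `VendorReader`, `importImage`, and the `.vnd` extension) is hypothetical and invented for illustration.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: hypothetical types, not the real Bio-Formats or OMERO classes.
final class ReaderInterfaceSketch {

    /** Hypothetical shared data model that every format reader must populate. */
    static final class CommonImage {
        final int width;
        final int height;
        final Map<String, String> metadata;

        CommonImage(int width, int height, Map<String, String> metadata) {
            this.width = width;
            this.height = height;
            this.metadata = metadata;
        }
    }

    /** Hypothetical contract implemented once per proprietary format. */
    interface FormatReader {
        boolean canRead(String fileName);
        CommonImage read(String fileName);
    }

    /** Stand-in reader for an invented ".vnd" vendor format; returns fixed values. */
    static final class VendorReader implements FormatReader {
        @Override
        public boolean canRead(String fileName) {
            return fileName.endsWith(".vnd");
        }

        @Override
        public CommonImage read(String fileName) {
            Map<String, String> metadata = new LinkedHashMap<>();
            metadata.put("Objective", "60x");
            return new CommonImage(512, 512, metadata);
        }
    }

    /** What a downstream application would call: it never sees the vendor format. */
    static CommonImage importImage(List<FormatReader> readers, String fileName) {
        for (FormatReader reader : readers) {
            if (reader.canRead(fileName)) {
                return reader.read(fileName);
            }
        }
        throw new IllegalArgumentException("No reader for " + fileName);
    }

    public static void main(String[] args) {
        CommonImage image = importImage(List.of(new VendorReader()), "cells.vnd");
        System.out.println(image.width + "x" + image.height + " " + image.metadata);
    }
}
```

In a design like this, supporting a new proprietary format means adding one more reader, while the application that consumes `CommonImage` stays unchanged.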
Earlier this year, he said, Glencoe made the “strategic decision” that Bio-Formats was a “critical tool for its long-term interests” and has been devoting resources to it.
Working the Tower of Babel
Eliceiri said that he and his colleagues at LOCI began developing Bio-Formats because they were dealing with “a big Tower of Babel of all these file formats and they don’t communicate or interact with each other.”
Acquiring an image on a microscope collects the metadata from that system, “which is anything but your pixel data,” Eliceiri said. It is possible to strip all the metadata from a proprietary file format to just read the pixels, ignore the proprietary metadata schema, and “still get my pretty image,” he said.
However, he noted, “times have changed, and with a lot of experiments you really can’t understand that pretty picture without knowing the metadata.”
As a result, “what we do in Bio-Formats is try to be incredibly faithful and reproduce [the metadata].”
Beyond being a file opener, Bio-Formats maps every field to the open standard, OME-XML. “That’s where the hard work comes into play,” Eliceiri said, adding that it is this mapping that delivers a standard metadata schema.
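The kind of field-by-field mapping Eliceiri describes can be sketched in a few lines of Java. This is a simplified illustration, not the actual Bio-Formats mapping code. The vendor keys (`img_w`, `img_h`, `px_um`) and the `Original:` prefix are invented for the example, and the target names are only modeled on OME-XML attribute names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: a toy mapping table, not the real Bio-Formats implementation.
final class MetadataMappingSketch {

    // Hypothetical vendor key -> common-schema key. The vendor keys are invented.
    static final Map<String, String> FIELD_MAP = Map.of(
            "img_w", "SizeX",
            "img_h", "SizeY",
            "px_um", "PhysicalSizeX");

    /** Translates vendor metadata into the common schema, keeping input order. */
    static Map<String, String> toCommonSchema(Map<String, String> vendorMetadata) {
        Map<String, String> common = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : vendorMetadata.entrySet()) {
            String target = FIELD_MAP.get(entry.getKey());
            // Fields with no known mapping are kept under a prefixed key
            // (an assumption of this sketch) so that no metadata is dropped.
            String key = (target != null) ? target : "Original:" + entry.getKey();
            common.put(key, entry.getValue());
        }
        return common;
    }

    public static void main(String[] args) {
        Map<String, String> vendor = new LinkedHashMap<>();
        vendor.put("img_w", "1024");
        vendor.put("px_um", "0.1");
        vendor.put("lamp_hours", "312");
        System.out.println(toCommonSchema(vendor));
    }
}
```

The hard part in practice is building and maintaining a table like `FIELD_MAP` for each of the roughly 60 formats, which is the work Eliceiri refers to.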
This mapping strategy opens up commercial opportunities on several levels, he said. For example, an image analysis software developer might have “terrific algorithms” but the software tool might only support a few file formats. “If they license Bio-Formats, they could immediately support 60 [file formats],” he said.
Another opportunity might be for a hardware company, which could integrate Bio-Formats to write OME-XML files. “For customers [who] have been complaining about proprietary formats, the company could say, ‘Now I am offering a solution that still uses a data model that I know but writes it in a common interchange that you can share freely,’” Eliceiri said.
Eliceiri said that several image-analysis and microscopy companies, which he declined to name, had approached his lab asking to use the software. “They said the GPL license hinders [them] and they’d also like increased support,” he said.
“Glencoe’s model is it takes this open-source foundation of knowledge, specifications, and software produced by the OME Consortium and wraps that up in a supported, customized form that can be passed on under a commercial license to a customer,” said Swedlow.
“We’re growing pretty well,” he said, adding that the company hopes “to grow two-fold” in 2009 over 2008, but did not provide financial details.
Glencoe’s first customer was Applied Precision, which began integrating Glencoe DME with its Deltavision imaging system in 2006. Then in 2007 Rockefeller University Press applied Glencoe DME to develop a way to let subscribers of the Journal of Cell Biology view and search multi-dimensional data. The journal this week launched the resulting browser-based application, which is built on DME and called JCB DataViewer.
More recently, PerkinElmer Cellular Technologies worked with the firm to develop its Columbus platform for high-content screening and microscopy-image storage, -management, and -analysis.
Columbus incorporates the OMERO Image Database and OMERO insight data-visualization tool, Martin Daffertshofer, global software product manager at PerkinElmer Cellular Technologies Germany, told BioInform in an e-mail.
“We chose to go in this technology direction in order to create products and offerings that provide a maximum of flexibility across standards, to accommodate users of all HCS instruments and software,” he said. “PerkinElmer is very supportive of open-source efforts, in particular the Open Microscopy Environment collaboration.”
Swedlow explained that an increasing number of organizations are using the OME data model, including Applied Precision, Bitplane, SVI, Leica, and Improvision. “It’s really picking up,” he said. Further adaptations will depend on the needs of the community, he said. “The community should decide what they need.”
Bio-Formats reads about 60 file formats and is not limited to light microscopy imaging. For example, it reads Gatan Digital Micrograph’s .dm3 and FEI’s .img formats, which are for electron microscopy; DICOM’s .dcm and .dicom file formats for medical imaging; Li-Cor L2D’s .l2d file format; the .tif and .scn 2D gel image formats; and high-content screening formats such as PerkinElmer’s .flex format and GE InCell 1000 .xdce and .tif formats. “We see Bio-Formats extending its coverage into many different domains,” Swedlow said.
Bio-Formats “is extending to be an image format tool,” said Eliceiri. He believes that the quickly growing imaging field might not allow one universal image file format to emerge. “Every day there is a new file format we become aware of,” he said. “We’re supporting 60 but there are another 40 we need to go out and support … probably more than that.”
The current list of file formats that Bio-Formats can read can be found here.