SAN LEANDRO, Calif.--In the conclusion of BioInform's exclusive two-part interview with Dennis Smith, senior vice-president and chief scientist for bioinformatics software provider MDL Information Systems here, he discusses recent trends in bioinformatics software and some of the company's strategic directions for the future.
BioInform: Are you seeing annotation arise as an issue with your customers?
Smith: I define annotation as commentary added to the basic raw data, be they DNA or protein sequences. That commentary presumably is related not just to the standard literature references, experimental conditions, and so forth--all of which are important and should be saved for future reference--but it also includes commentary necessary to help make decisions, generally in the form of functional hypotheses, research commentary, what's been observed in the lab, and so forth. Annotations are the key mechanism for trying to convert initial data into information. Raw sequences don't mean much to anybody. Annotation, which can be done partly automatically, partly by interactive analysis, is the key to understanding what each of these sequences might be, what role a putative protein might play in a biological process. The systems we put together can be viewed as basically annotation mechanisms that are making it possible for people to run automated or end-user-based tools and to save the results, so that others in an organization can make use of them.
We've found in many organizations that the same sequences are analyzed over and over again by lots of different people because they simply don't know the other people are interested in them. A lot of duplicate work gets done. We've tried to put in place systems that capture the results so that the next person looking at a particular piece of data can see the annotations and see what's been done on it, and not have to redo work. In fact, they may be able to get some ideas and do additional work, in turn saving that back in the database.
BioInform: How is MDL's bioinformatics staff integrated into the product development team?
Smith: Because of the nature of our relationships with our outside partners, Molecular Informatics and the University of Illinois, the team we have at MDL, which is approximately 10 people, is responsible for coordinating our worldwide sales and support activities. That group works in turn with our development partners, working together to define the next important things to do from a development perspective. So if the question is asked in terms of development, the aggregate number of people at Molecular Informatics and Illinois is probably around 35 people today involved in development, with our team of 10 at MDL coordinating the MDL organization worldwide to support the products.
BioInform: Tell me more about how MDL is collaborating with its partners.
Smith: A little bit of history. When we, working together with our customers, began to realize the strategic importance of genomics in the research process, and because our customers were looking at using that information to improve that research process, we made a decision to partner and not to build yet another bioinformatics system. The world has too many incompatible systems today. We chose to identify what we thought were best-of-breed partners in various areas and work with them to get products and solutions to our customers.
In the case of Molecular Informatics, they have a long history of working on database management approaches to sequence data management, and we chose to work with them for some of the basic database and analysis pipeline tools, to provide that portion of the solution to customers as the BioMerge System. Similarly, we chose to partner with the University of Illinois at Urbana-Champaign. Professor Subramaniam at the National Center for Supercomputing Applications and his group had done what I think is really excellent work, taking a variety of incompatible analysis tools and databases and pulling them together under a consistent, web-based user interface, so they were easy for the common man to use without being a Unix expert. The result is the Bioinformatics Workbench.
Again, those are things that can be built, but we felt it was better to take advantage of some technology and product that existed, in trying to respond to our customers' requirements quickly. And, by the way, we have integrated the two products so that the Bioinformatics Workbench can work with the BioMerge database system. Together they represent at least part of the kinds of systems people need to help manage this information.
BioInform: Will MDL be pursuing other collaborations in the future?
Smith: We certainly don't rule that out. In terms of some of the basic foundation systems and technologies people need to have in place, BioMerge and the Workbench represent, I think, excellent examples that we wouldn't necessarily be looking to supplement. The BioMerge system is, in fact, built on a conventional relational database management platform and is quite open to extension and to interfacing to a variety of different tool sets. So we're encouraging other people to work with us, and we'll probably do that through developers' relationships, rather than through actual partnerships, just to be sure that if companies choose one set of tools or another, they'll all be able to talk to a BioMerge system.
BioInform: MDL was acquired by Reed-Elsevier in April. What's that going to mean for the company?
Smith: Unfortunately I can't talk about details of projects we're doing, but I can talk about the general direction. The relationship with Reed-Elsevier has really focused on what has been called, appropriately or not, biblioinformatics, where the data source becomes what I would call relatively unstructured scientific information. Obviously in the case of Elsevier Science, they have an extensive body of scientific publications, all of which they have stored electronically, and they have made and will be making some of this information available in electronic form. With that relationship we think, and are working on demonstrating the fact, that systems can be built that are able to manage the cheminformatics and bioinformatics data, and also tie those results into available literature in the biblioinformatics sense, so that a scientist working at a desktop can get not just a static reference to a particular chemical structure or a particular piece of sequence data, but can use those data to link to that publication and actually view the publication and think about it and use that information--obviously much more detailed than the abstract alone--to help make decisions. That, I think, is going to be an increasing requirement, and it is possible to deliver that kind of information to desktops today. It's a matter of having the right relationships in place, and with Reed-Elsevier we think we've got a start.