Four genomics companies plus software maker Molecular Simulations have banded together to develop a next-generation set of bioinformatics tools.
In addition to MSI, members of the functional genomics consortium include Hyseq, Cytoclonal Pharmaceutics, Genaissance, and Grupo de Estudios Multicentricos en Argentina, a group of science and technology organizations in that country.
Members of the consortium will have exclusive rights to the new resources for three years; after August 2003, the resources will be opened to the public. According to David Edwards, MSI’s product manager for protein science, discussions are under way with other prospective members.
Rights to first use of the consortium’s new tools will give members a better shot at unknown new protein targets. Dorit Arab, executive vice president for drug design at Cytoclonal, said her company intends to develop new leads directed at cancer, viral disease, and the apoptosis pathway using Cytoclonal’s QCT methodology, which uses quantum chemical methods to design inhibitors of enzyme mechanisms. Cytoclonal’s Oasis library of antisense compounds will also benefit from discoveries turned up by the consortium, she said.
The group’s core resources are Gene Atlas, a suite of analytic tools that is being applied to public sequence data to generate annotations, and two databases that hold the results: AtlasBase and Genetics Computer Group’s SeqStore. AtlasBase manages proteins and protein structures, and SeqStore handles DNA sequences.
MSI is looking at integrating other GCG modules into other MSI offerings. GCG was recently acquired by MSI’s parent company, Pharmacopeia, for $27 million following the breakup of Oxford Molecular Group.
According to Scott Kahn, MSI’s vice president of life sciences, this integration will broaden the tools’ utility, so that chemists and biologists can use the same tools to look at data from different perspectives.
AtlasBase differs from existing annotation databases such as Swiss-Prot and TrEMBL in important respects, said Edwards. First, it contains the entire human genome. Second, the consortium members are giving strong emphasis to integrating structural information into it. A handful of smaller companies, including GeneFormatics, Structural Bioinformatics, and Quest Pharmaceuticals, are also building genome databases that have a structural emphasis.
One of the consortium’s immediate goals is to convert AtlasBase, at present a traditional GenBank flat file, to a relational format, which will facilitate cross-referencing of structural information, said Andrew Peek, a scientist at Cytoclonal. Peek explained that protein families that vary in primary sequence but share a similar three-dimensional structure will be mapped to a common template-like structure.
“The relational format will facilitate cross talk between different fields of the dataset, and if a modification is made to the structure, the relational format will make it easy to trace its effect through all members of the class,” he said.
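The idea Peek describes can be sketched in a few lines of Python with the standard sqlite3 module. This is only an illustration under stated assumptions, not AtlasBase’s actual design: the table names (fold_template, protein), the column names, and the accession strings are all hypothetical. The sketch shows one way a relational layout could link several proteins to a shared structural template, so that a change recorded once on the template is visible through every member of the class.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical two-table schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- One row per shared template-like structure (hypothetical table).
        CREATE TABLE fold_template (
            template_id INTEGER PRIMARY KEY,
            name        TEXT NOT NULL,
            note        TEXT
        );
        -- One row per protein; template_id cross-references the template.
        CREATE TABLE protein (
            protein_id  INTEGER PRIMARY KEY,
            accession   TEXT NOT NULL UNIQUE,
            template_id INTEGER REFERENCES fold_template(template_id)
        );
        """
    )
    conn.execute(
        "INSERT INTO fold_template VALUES (1, 'template_A', 'initial model')"
    )
    # Placeholder accessions: two proteins share the template, one has none.
    conn.executemany(
        "INSERT INTO protein (accession, template_id) VALUES (?, ?)",
        [("PROT_001", 1), ("PROT_002", 1), ("PROT_003", None)],
    )
    return conn


def members_of(conn, template_name):
    """Return (accession, template note) for every protein on a template."""
    return conn.execute(
        """
        SELECT p.accession, t.note
        FROM protein AS p
        JOIN fold_template AS t ON p.template_id = t.template_id
        WHERE t.name = ?
        ORDER BY p.accession
        """,
        (template_name,),
    ).fetchall()


conn = build_demo_db()
# Modify the template once; the join traces the change to all class members.
conn.execute(
    "UPDATE fold_template SET note = 'revised model' WHERE name = 'template_A'"
)
affected = members_of(conn, "template_A")
print(affected)
```

In a flat file, the same revision would have to be found and edited separately in each protein’s record; here the template is stored once and each protein only points to it.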
The Gene Atlas suite of tools includes MSI’s array of individual tools, such as Cerius2, Catalyst, and Insight II, which can be spliced together in many different configurations according to consortium members’ R&D objectives. In addition to off-the-shelf sequence, chemical, and protein tools, new ones being brought in include Modpipe, a method developed by Andrej Sali of Rockefeller University for classifying and predicting protein folds, and SeqFold, from David Eisenberg’s lab at University of California, Los Angeles.
MSI’s Edwards said that consolidating many sequence-based and 3D analyses in one place increases the amount of the genome that can be annotated by as much as 15 percent, with a corresponding improvement in accuracy. Structural information is the key to making this happen.
Present homology-based analyses miss some kinds of relationships that another method might pick up, he explained. “For example, PSI-BLAST is poor at recognizing homology relationships in the Ig superfamily, as a recent paper in the Journal of Molecular Biology by Michael Sternberg pointed out. You might miss appropriate targets in something like CD-4, but the 3D perspective can compensate for this weakness.”
MSI’s history of building customer relationships based on the consortium model goes back at least a decade, with earlier precedents in combinatorial chemistry, polymers, catalysts, and other industry areas.
According to Kahn, by doing business this way “we get close working relationships with leading software users and developers in the field.” For their part, member companies benefit from belonging to a “unique forum that allows them to collaborate in a pre-competitive fashion.” Membership is on an annual subscription basis, with two levels available. At level one, MSI works up a member company’s data and sends the results back; at level two, MSI provides tools to the member company so that it can do the work itself. In either case, the data remain the exclusive property of the member company.
The consortium, which was formed in August, is further evidence of the growing importance of functional genomics in the bioinformatics community. Other recently formed groups include New York University’s functional genomics center and a $37 million initiative, announced in October by the US National Institutes of Health’s National Heart, Lung, and Blood Institute, to target genes involved in heart, blood, lung, and sleep-related diseases.