Bioinformatics consultancy 3rd Millennium of Cambridge, Mass., has received a two-year $1.8 million grant from the US National Institute of Standards and Technology to develop enterprise-wide software that will integrate genomic and proteomic data at the level of biological systems.
The company, which intends to commercialize the software at the end of the two-year period, expects to concentrate on shifting the approach to information management from lists of biological objects to networks of biological pathways that link gene function and expression, protein interactions, disease models, and responses to therapies, all in the context of the organs, tissues, and cells in which they occur.
Eric Neumann, 3rd Millennium’s vice president for life science informatics and principal investigator for the project, said that once the human genome is annotated, a key question will be how researchers can use that information.
“We’re going to pull together all knowledge of protein interactions, pathways, and systems phenomenon in living organisms and merge it with genomic and expression information,” said Neumann.
The company, which has not yet named the system, expects to license the technology to genomics, biotech, and pharmaceutical companies, including those that generate genomic and proteomic data.
Neumann said he is not aware of any competing technologies that can do what 3rd Millennium envisions. While other integration and platform-independent technologies exist, the company’s planned system will be focused on function rather than concentrating primarily on genes, he noted.
The system will have query, analysis, and shared view capabilities to enable users to get more value out of raw data, said Neumann.
“It’s looking at the interactions of things rather than the sequence here and the protein here. It literally connects a protein with an expression pattern with a disease with the gene model with what tissue it appears in at what time. All that information gets interconnected,” said Neumann.
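The kind of interconnection Neumann describes can be pictured as a graph in which every biological object is a node and every relationship is a labeled link. The sketch below is only an illustration of that idea, not 3rd Millennium's design: the `PathwayGraph` class, the relation labels, and the placeholder names such as `protein_A` and `gene_A` are all hypothetical.

```python
from collections import defaultdict


class PathwayGraph:
    """Hypothetical sketch: biological objects as nodes, relationships as labeled links."""

    def __init__(self):
        # node -> set of (relation, neighbor) pairs
        self._edges = defaultdict(set)

    def link(self, source, relation, target):
        # Store the link in both directions so it can be followed from either end.
        self._edges[source].add((relation, target))
        self._edges[target].add((relation, source))

    def neighbors(self, node):
        # Directly linked objects, with the relation that connects them.
        return sorted(self._edges.get(node, ()))

    def connected(self, start):
        # Everything reachable from `start` by following links (depth-first search).
        seen = {start}
        stack = [start]
        while stack:
            node = stack.pop()
            for _, other in self._edges.get(node, ()):
                if other not in seen:
                    seen.add(other)
                    stack.append(other)
        return seen - {start}


# Placeholder data only; none of these names refer to real genes or proteins.
graph = PathwayGraph()
graph.link(("protein", "protein_A"), "encoded_by", ("gene", "gene_A"))
graph.link(("gene", "gene_A"), "expressed_in", ("tissue", "tissue_X"))
graph.link(("protein", "protein_A"), "implicated_in", ("disease", "disease_Y"))

related = graph.connected(("tissue", "tissue_X"))
```

Starting from the tissue, `related` contains the gene, the protein, and the disease, which is the difference between a list of objects and a network of them.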
3rd Millennium is designing the software to handle petabytes of information so that it can manage the scale and diversity of data that proteomics and genomics research will generate.
Some components for the system, such as handling data through an ontology manager, have been built and partially tested. Ontologies are specifications of the concepts in a given field and the relationships among those concepts.
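As a rough illustration of that definition, an ontology can be reduced to a set of concepts plus "is a" relationships between them. The sketch below assumes nothing about the company's ontology manager; the `Ontology` class and its method names are hypothetical, and the three concepts are chosen only as a familiar example.

```python
class Ontology:
    """Hypothetical sketch: concepts plus 'is a' relationships among them."""

    def __init__(self):
        # concept -> set of directly broader concepts
        self._parents = {}

    def add_concept(self, concept, is_a=()):
        # Register the concept and make sure each broader concept exists too.
        self._parents.setdefault(concept, set()).update(is_a)
        for parent in is_a:
            self._parents.setdefault(parent, set())

    def ancestors(self, concept):
        # All broader concepts, found by walking the 'is a' links upward.
        found = set()
        stack = list(self._parents.get(concept, ()))
        while stack:
            current = stack.pop()
            if current not in found:
                found.add(current)
                stack.extend(self._parents.get(current, ()))
        return found

    def is_a(self, concept, broader):
        return concept == broader or broader in self.ancestors(concept)


# Illustrative concepts only.
ontology = Ontology()
ontology.add_concept("protein")
ontology.add_concept("enzyme", is_a=["protein"])
ontology.add_concept("kinase", is_a=["enzyme"])

kinase_is_protein = ontology.is_a("kinase", "protein")
```

Because the relationship is followed transitively, a query about proteins can also pick up data filed under the narrower concept `kinase`.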
“The ontology element in solving the problem is a key component of what we’re going to be developing. It will help us realize all these goals,” he said.
In the early stages of work under the grant, the company will develop the basic information management system for handling real data from model organisms of increasing complexity, including the human genome.
The system will be layered with other interactive components, such as database wrappers, pathway model constructors and analyzers, external management of data resources, intelligent cross-domain querying, visualization tools, simulation engines, and expert system inference functions that focus on biological cause and effect.
Neumann said that the grant-related development efforts will closely follow a biopathways informatics standard that is being created by the BioPathways Consortium, a group founded by 3rd Millennium and Paris-based proteomics company Hybrigenics. The two companies announced the formation of the consortium at the Intelligent Systems for Molecular Biology conference in August.
The goal of the consortium is to help establish common computing standards for information analyses related to genome-based drug development by researchers representing government agencies, industry, and academic organizations worldwide. The consortium intends to work closely with other life science informatics efforts.
These standards will enable tools and databases associated with pathways data to be linked in ways that make sense to biological scientists, said Neumann. He noted that databases are often designed by computer scientists, whose different orientation can lead other scientists to misinterpret them.
Such standards will also be important for the specifications and requirements of computational hardware manufacturers that are focusing on the life science industry, said Neumann.
The grant was awarded under NIST’s Advanced Technology Program, which was created to fund American companies pursuing the research and development projects needed to accelerate the creation of new technologies important to the economy. Affymetrix and CuraGen are previous grant recipients. NIST is a division of the US Department of Commerce.