For more than a century, the National Institute of Standards and Technology has carefully tended the weights and measures that sustain US industry — from heavy manufacturing to IT. Now, the institute is making plans to stay ahead of the game by building its capabilities in the biological sciences.
NIST believes that “the twenty-first century economy will have a huge biology component, so in house we need to be ready to address concerns that come up during that time,” said Angela Hight Walker, research partnership advisor at NIST. Hight Walker added that the institute is still in the early stages of formulating its life science strategy, “but NIST definitely realizes that this is important, and we have put the groups together where we hope to begin to stake out how we’re going to make contributions.”
Even though the new strategy is still at an embryonic stage, NIST has already identified bioinformatics as one of its focus areas. Noting biology’s “increasing effort to become quantitative rather than strictly classificational or qualitative,” Vince Vilker, chief of NIST’s biotechnology division, predicted that “most of the growth in the life sciences is going to be attributable to improved methods of quantifying and verifying information” — a “natural fit” for NIST, he said. But, he added, one immediate challenge for the institute “is that out of 3,000 staff members, only about 100 have hands-on life science backgrounds.”
To address this problem, the institute created a cross-disciplinary Biosystems and Health working group in 2001 “to articulate how NIST’s strength in physical and informational sciences can be applied to the world of life sciences,” Vilker said. The initiative grew out of a broader ten-year strategic plan at NIST that also identified nanotechnology, information technology, and homeland security as NIST-wide “strategic focus areas” to address through 2010. The Biosystems and Health working group has a budget on the order of $25 million in FY 2004, Hight Walker estimated, adding that the exact budget for NIST’s life science activities is difficult to pin down because some cross-disciplinary efforts are partly funded by the participating labs.
NIST’s long-term commitment to its life science effort is evident in its budget outlook. The institute has requested a $5 million increase in its Biosystems budget for FY 2005, and Hight Walker said that another increase is planned for the FY 2006 budget, which is now being drawn up. The working group’s goal, Vilker said, isn’t a short-term fix, but rather a mission to identify “what a suite of bioscience core competencies should look like ten years from now.”
That decade-long timeline hasn’t stopped NIST from jumping into bioinformatics with both feet, however. The institute currently supports a laundry list of bioinformatics projects (see list below), including its long-standing involvement with the Protein Data Bank, and plans to move into a number of new areas. One particular focus right now is gene expression analysis, a field in dire need of standardization. NIST is a founding member of the External RNA Controls Consortium, a group working on platform-independent RNA spike-in controls for microarray analysis and RT-PCR, as well as the informatics and data management tools required to support the effort.
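To make the idea of a spike-in control concrete, here is a minimal Python sketch. It is not the consortium's method. It assumes that each platform's signal follows a straight line against the known spike-in concentration when both are plotted on a log2 scale. The function names and every number in it are hypothetical.

```python
import math

def fit_spike_in_curve(known_conc, measured):
    """Least-squares fit of log2(measured) = slope * log2(conc) + intercept."""
    xs = [math.log2(c) for c in known_conc]
    ys = [math.log2(m) for m in measured]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def estimate_concentration(signal, slope, intercept):
    """Invert the fitted line to turn a raw signal into a concentration."""
    return 2 ** ((math.log2(signal) - intercept) / slope)

# Hypothetical spike-in series: the same known amounts run on two platforms.
spike_conc = [1.0, 4.0, 16.0, 64.0]
platform_a = [50.0, 200.0, 800.0, 3200.0]  # made-up raw intensities
platform_b = [10.0, 20.0, 40.0, 80.0]      # made-up raw intensities

curve_a = fit_spike_in_curve(spike_conc, platform_a)
curve_b = fit_spike_in_curve(spike_conc, platform_b)

# The same hypothetical transcript gives very different raw signals on each
# platform; the spike-in curves map both back to a shared concentration scale.
est_a = estimate_concentration(400.0, *curve_a)
est_b = estimate_concentration(10.0 * 8.0 ** 0.5, *curve_b)
print(est_a, est_b)
```

In this made-up example the two raw signals differ by more than a factor of ten, yet both estimates land at about 8 concentration units once each platform is calibrated against its own spike-ins. Real assays are noisier and need not be log-linear, so the straight-line fit here is only a stand-in.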
Nell Sedransk, a mathematical statistician in NIST’s information technology laboratory, said that the oft-cited variation between different microarray platforms remains a “conundrum” because it’s difficult to determine whether these variations are due to differences in the chips or the informatics — or a combination of both. “Both pieces contribute,” she said. “The algorithms themselves make up just as much of a component of the information that comes back to the clinician as the bench science did, and one of the things that NIST has a long history of doing is evaluating, assessing, devising metrics for, and examining algorithms and examining information in a wide variety of forms.”
Sedransk said that NIST’s information technology lab also views biological simulation as an area in which it can “define best practices and guidelines.” She added that NIST has a long history of working with IT standards bodies, but doesn’t see its role as creating IT standards per se. Rather, she said, the institute develops test suites, protocols, and other resources that can be used as benchmarking tools to gauge conformance with existing standards — a role it intends to play in bioinformatics as well. To date, NIST’s involvement in life science standards has focused on healthcare informatics groups, such as the ANSI Healthcare Informatics Standards Board, Health Level Seven, and ASTM International. NIST is also a member of the W3C, where it is leading an effort to develop conformance tests for XML technologies.
Another area of interest is in semantic standards for biomedical applications. Ram Sriram, group leader in the design process group in NIST’s manufacturing systems integration division, said that his group has identified a number of seemingly unlikely areas of overlap between manufacturing and life science research. An ongoing project to address interoperability issues in CAD/CAM software, for example, led to an interest in semantic-based technology, which also shows great promise for the biosciences, according to Sriram — particularly in areas such as nomenclature disambiguation and ontology development.
The manufacturing group has started a new program called Manufacturing Metrology and Standards for the Healthcare Enterprise, Sriram said, which addresses several bioinformatics issues, such as process models and ontologies for biochemical pathways; testing services for computational biology algorithms; and repositories that “describe the form, function, and behavior of biological entities at various levels of abstraction, like gene, protein, cell, tissue, and organ.”
Vilker noted that “one of the hottest growing areas in the life sciences activity at NIST” is in biomarker validation. NIST currently maintains STRBase, a curated collection of short tandem repeats and other genetic markers. STRBase is used for “everything from looking at migration patterns for the way human beings populated the planet, to specific information taken into a court of law on a paternity suit,” Vilker said, but NIST is expanding its efforts in a partnership with the National Cancer Institute to validate markers before they go into clinical practice.
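For readers unfamiliar with short tandem repeats, the quantity that varies between individuals is how many times a short motif repeats back to back at a given site. The Python sketch below counts the longest such run in a sequence. It is an illustration only, not how STRBase or any forensic pipeline works, and the sequence, the motif choice, and the function name are hypothetical.

```python
def longest_repeat_run(sequence, motif):
    """Return the largest number of back-to-back copies of motif in sequence."""
    if not motif:
        raise ValueError("motif must be a non-empty string")
    best = 0
    step = len(motif)
    # Try every start position so overlapping candidate runs are not missed.
    for start in range(len(sequence)):
        count = 0
        pos = start
        while sequence.startswith(motif, pos):
            count += 1
            pos += step
        best = max(best, count)
    return best

# Hypothetical sample: invented flanking bases around seven GATA repeats.
sample = "TTCC" + "GATA" * 7 + "CCTG"
repeats = longest_repeat_run(sample, "GATA")
print(repeats)
```

The repeat count at a marker is what gets compared between samples, which is why a curated reference of markers matters for both population studies and courtroom use.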
NIST is also working closely with the Food and Drug Administration, Vilker said, at an activity level that is “two- or three-fold” that of several years ago. The institute is partnering as well with the Centers for Disease Control and Prevention on a project to certify genetic testing measurements as predictors for disease, he said.
In addition to its internal evaluation and its partnerships inside the Beltway, NIST is reaching out to the broader research community via meetings and workshops. On June 9, NIST will host a workshop entitled “Bioinformatics Needs for Use of Spike-in Controls in Gene Expression Assays” as part of its ongoing effort with the External RNA Controls Consortium. Information on the meeting is available at http://www.cstl.nist.gov/biotech/workshops/ERCC2004/index.html. On June 22, Sriram will co-host a workshop entitled “Using Semantic Standards to Integrate Biomedical Imaging into Clinical Decision-Making” as part of the NIH BISTI 2004 symposium (http://www.capconcorp.com/becon2004/post_sema.asp).
Eventually, NIST hopes to be as trusted in the biological world as it is in the realm of the physical sciences. “When you go to get gas, the gallon is somehow traceable to NIST in terms of the measurement unit,” said Sriram. “Similarly, in the future, when people measure pathogens and DNA markers, that measurement activity will be traceable to some of the things that are currently underway at NIST.”
Some Bioinformatics Activities Underway at NIST
Automated Processing of Peptide Mass Spectra: NIST’s Physical and Chemical Properties Division is working with NIH and NCBI to develop standards for automated processing of peptide mass spectra.
Biological Macromolecular Crystallization Database: A database of crystallization conditions and crystal data for all forms of biological macromolecules. (http://wwwbmcd.nist.gov:8080/bmcd/bmcd.html)
MitoAnalyzer: A Java program that allows users to determine how a polymorphism affects the resulting protein in human mitochondrial DNA. (http://www.cstl.nist.gov/biotech/strbase/mitoanalyzer.html)
Protein Data Bank: A global repository for processing and distributing 3D macromolecular structure data. (http://www.rcsb.org/pdb/)
Short Tandem Repeat DNA Internet Database: A curated database of short tandem repeat DNA markers for human identity testing. (http://www.cstl.nist.gov/biotech/strbase)
Thermodynamics of Enzyme-Catalyzed Reactions Database: A compilation of data on the thermodynamics of enzyme-catalyzed reactions. (http://wwwbmcd.nist.gov:8080/enzyme/enzyme.html)
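
The MitoAnalyzer entry in the list above concerns how a polymorphism changes a mitochondrial protein. A rough Python sketch of that kind of check follows. It is not MitoAnalyzer's code. It relies on the fact that the vertebrate mitochondrial genetic code differs from the standard code at four codons (TGA, ATA, AGA, AGG). The function and variable names are hypothetical.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
STANDARD_AAS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

def build_table(amino_acids):
    """Map each of the 64 codons to its one-letter amino acid ('*' = stop)."""
    codons = [a + b + c for a in BASES for b in BASES for c in BASES]
    return dict(zip(codons, amino_acids))

STANDARD_CODE = build_table(STANDARD_AAS)

# Vertebrate mitochondrial code: same as standard except for four codons.
MITO_CODE = dict(STANDARD_CODE)
MITO_CODE.update({"TGA": "W", "ATA": "M", "AGA": "*", "AGG": "*"})

def classify_substitution(codon, position, new_base, code=MITO_CODE):
    """Describe the effect of changing one base (0-based position) in a codon."""
    codon = codon.upper()
    mutated = codon[:position] + new_base.upper() + codon[position + 1:]
    before, after = code[codon], code[mutated]
    if before == after:
        kind = "synonymous"
    elif after == "*":
        kind = "nonsense"
    elif before == "*":
        kind = "stop-loss"
    else:
        kind = "missense"
    return before, after, kind

# The same A-to-G change at the third base of ATA reads differently
# depending on which genetic code is applied.
print(classify_substitution("ATA", 2, "G"))
print(classify_substitution("ATA", 2, "G", code=STANDARD_CODE))
```

Under the mitochondrial code ATA and ATG both encode methionine, so the change is silent, while the standard code would report an isoleucine-to-methionine swap. That difference is why a tool for mitochondrial DNA has to use the mitochondrial table.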