Paper Calls for Broader Use of AI-Based Methods in Bioinformatics

NEW YORK, Sept. 18 - Future understanding of genomic data may be severely limited unless bioinformaticists gain a better understanding of knowledge representation, according to Peter Karp, director of SRI International's Bioinformatics Research Group.

In a paper that appeared in the September 14, 2001, issue of Science, Karp outlined the role that symbolic computing and artificial intelligence would play as bioinformatics begins to address biological systems and scientific theories. Karp wrote that systems biology theories are quickly growing too complex to be understood by individual scientists. In response, he argued, bioinformaticists will have to turn to artificial intelligence methods that will enable computers to verify a theory's internal consistency, its global properties, and its consistency with external data.

In the paper, Karp highlighted SRI’s EcoCyc Project as an example of an effective use of AI-based methods in biological research. EcoCyc is a symbolic pathway database that describes the metabolic, transport, and genetic-regulatory networks of Escherichia coli. EcoCyc is structured according to an ontology, or database schema, that captures semantic distinctions and precisely defines the meaning of different database fields. This structure provides an interconnected web of frames stored in a frame knowledge representation system that enables computer-based reasoning across the network.
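As a rough illustration of what a schema-governed web of frames looks like in practice, the following Python sketch builds a toy knowledge base. It is not EcoCyc's schema or the Pathway Tools API; the class names, slot names, and entries such as geneA and pathwayX are hypothetical placeholders. It assumes a simplified ontology in which each slot may only point to frames of one declared class, which lets a program both check the knowledge base for consistency and follow links from a gene to the pathways it takes part in.

```python
from dataclasses import dataclass, field

# Hypothetical mini-ontology (NOT EcoCyc's real schema): for each class,
# the slots it may have and the class each slot's values must belong to.
SCHEMA = {
    "Gene": {"product": "Protein"},
    "Protein": {"catalyzes": "Reaction"},
    "Reaction": {"in_pathway": "Pathway"},
    "Pathway": {},
}


@dataclass
class Frame:
    name: str
    cls: str
    slots: dict = field(default_factory=dict)  # slot name -> list of frame names


class KnowledgeBase:
    def __init__(self, schema):
        self.schema = schema
        self.frames = {}

    def add(self, name, cls, **slots):
        self.frames[name] = Frame(name, cls, {k: list(v) for k, v in slots.items()})

    def check(self):
        """Return a list of schema violations; an empty list means consistent."""
        errors = []
        for f in self.frames.values():
            allowed = self.schema.get(f.cls)
            if allowed is None:
                errors.append(f"{f.name}: unknown class {f.cls}")
                continue
            for slot, values in f.slots.items():
                if slot not in allowed:
                    errors.append(f"{f.name}: slot {slot} not defined for {f.cls}")
                    continue
                for v in values:
                    target = self.frames.get(v)
                    if target is None:
                        errors.append(f"{f.name}.{slot}: missing frame {v}")
                    elif target.cls != allowed[slot]:
                        errors.append(
                            f"{f.name}.{slot}: {v} is a {target.cls}, "
                            f"expected {allowed[slot]}"
                        )
        return errors

    def follow(self, name, *path):
        """Follow a chain of slots from one frame; return reachable frame names."""
        current = {name}
        for slot in path:
            reached = set()
            for n in current:
                frame = self.frames.get(n)
                if frame is not None:
                    reached.update(frame.slots.get(slot, []))
            current = reached
        return sorted(current)


def build_example():
    # Placeholder entries for illustration only, not real E. coli data.
    kb = KnowledgeBase(SCHEMA)
    kb.add("geneA", "Gene", product=["enzymeA"])
    kb.add("enzymeA", "Protein", catalyzes=["reaction1"])
    kb.add("reaction1", "Reaction", in_pathway=["pathwayX"])
    kb.add("pathwayX", "Pathway")
    return kb


if __name__ == "__main__":
    kb = build_example()
    print(kb.check())
    print(kb.follow("geneA", "product", "catalyzes", "in_pathway"))
```

Because every slot's meaning is fixed by the schema, the same structure supports both kinds of machine reasoning in this toy setting: check() flags an entry that violates the ontology, and follow() answers a multi-step question by walking the links between frames.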

According to Karp, this approach holds a significant advantage over conventional bioinformatics resources, which are typically text-based repositories of theories and data. “Although the scientific community clearly accepts the need to encode the ever-expanding quantity of scientific data within databases,” Karp wrote, “databases of scientific theories, such as a theory describing the transcriptional regulation of E. coli genes, are much rarer.”

As biological research grows more and more dependent on information technology to make sense of increasing amounts of genomic data, Karp wrote, it will be crucial for bioinformaticists to keep up with new developments in symbolic computing. “The genome revolution is increasing the need for pathway databases in the biological sciences, and similar developments will occur in other sciences. However, effective implementation of this paradigm is hampered because most biologists (and most other scientists) receive essentially no education in databases or knowledge representation.”

According to Karp, equipping scientists with a better understanding of knowledge representation concepts--such as data models, ontologies, database query languages, logical inference, database design, and formal grammars--will be necessary to carry the field forward.

EcoCyc was developed using SRI's Pathway Tools software environment, which encodes systems biology theories and supports query, analysis, and visualization operations for pathway and genome databases. SRI is currently applying Pathway Tools to a number of other organisms.  
