Selventa Partners with Linguamatics to Add Text Mining to Knowledge-Extraction Framework


Selventa said this week that it has partnered with Linguamatics to add natural language processing-based text-mining capability to its knowledge-extraction and discovery platform.

The partners will use Linguamatics' I2E text-mining technology to automate the extraction of life science information from the literature. The extracted information will then be converted into Selventa's structured Biological Expression Language (BEL) format so that it can be used to interpret large-scale experimental data.

In a statement, David de Graaf, Selventa's president and CEO, said that the partnership "will enable rapid yet comprehensive investigation of new areas of biology by extracting computable knowledge from unstructured text." He added that the alliance could benefit next-generation sequence analysis, for example, where "well-structured information for reasoning has been limited."

Furthermore, the combined platform could be used in "drug development decisions in areas such as translational medicine and clinical proof-of-concept stages," de Graaf said.

Selventa's BEL is a structured language for representing scientific findings along with supporting contextual information. The company uses the language in its discovery platform, but said earlier this year that it planned to release it as an open source framework (BI 3/4/2011).
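As a rough illustration of what a structured finding with supporting context can look like, the Python sketch below turns one extracted subject-relation-object fact into a BEL-style statement preceded by a context annotation. It is not Selventa's or Linguamatics' code: the `ExtractedRelation` class, the `to_bel_like` function, and the example fact are hypothetical, and the output only approximates BEL notation.

```python
from dataclasses import dataclass, field


@dataclass
class ExtractedRelation:
    """One subject-relation-object fact, in a hypothetical shape a
    text-mining step might hand over (an assumption, not I2E's format)."""
    subject: str
    relation: str
    obj: str
    context: dict = field(default_factory=dict)


def to_bel_like(rel: ExtractedRelation) -> str:
    """Render the fact as BEL-style text: context annotations first,
    then the statement itself."""
    # Sort the context keys so the output is deterministic.
    lines = [f'SET {key} = "{value}"' for key, value in sorted(rel.context.items())]
    lines.append(f"{rel.subject} {rel.relation} {rel.obj}")
    return "\n".join(lines)


# Illustrative fact only; it is not taken from the article or from any real extraction run.
example = ExtractedRelation(
    subject="p(HGNC:TNF)",
    relation="increases",
    obj='bp(GO:"apoptotic process")',
    context={"Tissue": "liver"},
)

print(to_bel_like(example))
```

Keeping the context in a separate mapping mirrors the idea that the finding and its supporting information are recorded together but remain distinguishable.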

In April, the company announced that it was working with Pfizer to establish a community of stakeholders and users and to develop a portal for disseminating BEL and the BEL framework to the scientific community (BI 4/1/2011).

The first version of the BEL Framework Community Edition is planned for the first quarter of 2012. Users can sign up for accounts on the site to get more detailed documentation about BEL, participate in a community section, and stay informed on the progress of the launch.

Linguamatics' I2E platform provides NLP-based capabilities to help users identify and extract relationships in unstructured text that can be used in biological investigation and analysis.

The system also lets users go back to the source of the data and extract more information as needed.