Linguamatics and ChemAxon said this week that they will collaborate on an EU-funded project to develop text-mining software for chemical research.
The project, dubbed Chemically Informed Knowledge Extraction from Literature, or ChiKEL, aims to build a system that will integrate advanced chemical search capabilities with tools for extracting relationships between structures and other biological or chemical entities.
The project was awarded €671,000 ($870,000) under the Eureka Eurostars program, a European funding initiative focused on supporting small businesses in the research sector.
The collaboration will build on an integration partnership the companies began in 2009, which added Linguamatics' text-mining capabilities to ChemAxon's substructure and similarity searching tools (BI 4/24/2009).
The ChiKEL system will recognize novel chemical compounds expressed in a variety of ways, including IUPAC names and images, and will enhance the presentation of search results so that users can view chemical structures and browse clusters of structures found within the documents.
These capabilities will allow users to perform chemical structure and biological searches across patents, scientific articles, and internal documents, and to extract structured information from them for further analysis, the partners said.
Additionally, the software will help users find structures in documents where manual mark-up has not been done, is incomplete, or is economically unfeasible. It will also let researchers find chemical structures at particular points within a document.
The partners are seeking beta testers for the software. Potential users include pharmaceutical and biotechnology companies, as well as companies in adjacent markets such as food, agrochemicals, and healthcare.