Scientists Debate Publishing Future


At the Intelligent Systems for Molecular Biology conference, researchers weighed in on the role that bioinformatics tools will play in the future of scientific publishing.

During the conference, several speakers discussed new text-mining tools and other methods for extracting information from scientific papers. For example, Carnegie Mellon's Robert Murphy spoke about his work on parsing information about images from the biological literature. He and his colleagues have developed SLIF, the Subcellular Location Image Finder, a platform that can extract information from captions and figures containing fluorescence microscopy images.

Murphy said in his presentation that he believes these types of systems will become more widespread, but that in the meantime it will be important to find ways to "improve practices of defining content," which would make text- and image-mining easier.

One notion he described is a "structured digital caption" that would not show up in the printed paper but would be encoded in the XML file that describes the images that are part of a publication. "That makes the parsing of a figure easier," he said.
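As a rough illustration of why such a caption would simplify parsing, the Python sketch below reads panel-level metadata directly from a figure's XML instead of inferring it from free-text caption prose. The markup is an assumption made for this example: the element and attribute names (`structured-caption`, `panel`, `modality`, `target`) are hypothetical and are not taken from SLIF or from any journal's schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical figure markup. Every element and attribute name here is
# invented for illustration; no real publisher schema is implied.
SAMPLE_FIGURE = """
<figure id="fig1">
  <caption>Localization of two tagged proteins.</caption>
  <structured-caption>
    <panel label="A" modality="fluorescence" target="tubulin"/>
    <panel label="B" modality="fluorescence" target="lamin"/>
  </structured-caption>
</figure>
"""


def read_panels(xml_text):
    """Return one dict per <panel> element found in the figure XML."""
    root = ET.fromstring(xml_text.strip())
    panels = []
    for panel in root.iter("panel"):
        panels.append(
            {
                "figure": root.get("id"),
                "label": panel.get("label"),
                "modality": panel.get("modality"),
                "target": panel.get("target"),
            }
        )
    return panels


panels = read_panels(SAMPLE_FIGURE)
for entry in panels:
    print(entry["figure"], entry["label"], entry["modality"], entry["target"])
```

With the metadata encoded this way, a mining tool would only need to read attributes rather than interpret caption sentences or segment the image to work out what each panel shows.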

Yale's Mark Gerstein favors the idea of linking databases and journal articles so that scientists can track a given gene annotation in a database back to the published paper. "There is no good framework for browsing through the genome in the framework of publications," he said. While loading that information into one monolithic database may not be possible, federated queries across structured, ontology-oriented abstracts could help, he suggested.

He said his group "is very keen on this idea of structured abstracts" as a small first step toward connecting journal articles and databases. Because the authors themselves would write these abstracts, the machine-readable version would be of high quality. Those texts could in turn serve as the training set for a larger-scale machine-reading project.

Vivien Marx

Bioinformatics Notes

Biomax Informatics will integrate a toxicogenomics database developed by the Mount Desert Island Biological Laboratory with its data-management platform. The MDI database system includes information about cross-species interactions between genes, chemicals, and proteins that can be used to study disease susceptibility and diseases that are influenced by the environment.

Simulations Plus has signed a multi-year collaboration with Roche, which will provide funding and feedback in developing the firm's GastroPlus software program. Simulations Plus will collaborate with Roche scientists to advance the capabilities of GastroPlus to simulate drug-drug interactions. Roche is slated to provide funding for the equivalent of one full-time scientist for two years.


$3 million
Compugen's net loss for the quarter ending June 30, 2008, including a non-cash expense of $416,000 related to stock-based compensation.

Funded Grants

$1,200,000/FY 2008
EDAC: ENCODE Data Analysis Center
Grantee: Ewan Birney, European Bioinformatics Institute
Began: May 15, 2008; Ends: Mar. 31, 2012

This proposal aims to integrate data from multiple sources, using sophisticated statistical models and machine-learning techniques to build methods that combine datasets. Birney and his team will also use this grant to provide quality assurance and summary metrics of genome-wide multiple alignments. Overall, they aim to provide deep integration of the ENCODE data, under the direction of the AWG and in tight collaboration with the other members of the ENCODE consortium.

$273,906/FY 2008
Adaptive Personalized Information Management for Biologists
Grantee: William Cohen, Carnegie Mellon University
Began: Jul. 11, 2008; Ends: May 31, 2012

This funding will enable the development of an adaptive information management tool. Cohen and his team intend to exploit recent advances in machine learning and database systems to support their scheme for loosely integrating structured information with unstructured text, and then querying the integrated information using easily formulated similarity queries.
