Bioinformatics Briefs: Dec 24, 2001



IBM added to its stable of proteomics partners last week in a deal that makes it GeneFormatics’ preferred information technology and services partner. As part of the deal, IBM made an equity investment in GeneFormatics’ Series C round of financing.

The amount of IBM’s investment was not disclosed.

IBM will supply GeneFormatics with a cluster of IBM eServer xSeries 330 systems running Linux, its “Shark” Enterprise Storage Server, and the DB2 Universal Database. According to the companies, “future directions” for the partnership will include IBM’s DiscoveryLink data integration software.



Boehringer Ingelheim Canada, based in Laval, Québec, has signed on to a research collaboration built around the Variome structural pharmacogenomics database from San Diego, Calif.-based Structural Bioinformatics.

Boehringer Ingelheim intends to use the database, which contains more than 100,000 variant 3D protein structures, for in silico drug discovery against HIV protein targets. The company expects to use information about mutated HIV protein structures in the design of anti-HIV drugs.

“Our initial focus has been on the creation of the world’s most comprehensive structural variant database for the HIV protease and reverse transcriptase,” said Edward Maggio, CEO of SBI, in a statement. “We expect to move forward into anti-bacterial targets next.”


The Science Advisory Board has completed a study on protein databases as part of its recent series on protein science and proteomics. The study surveyed 462 scientists who use protein databases and proteomics software.

The study found that almost three-quarters of protein scientists currently use databases in their protein research and predicts that this number will increase by 11 percent over the next year.

Surprisingly, the results of the study indicate that only 29 percent of protein scientists currently rely on additional proteomics software to filter, cluster, or analyze proteomics data.

A “snapshot” of the study results is available at



Cleveland-based NetGenics has developed a new set of generic wrappers for IBM’s DiscoveryLink data integration technology that it says will speed implementation of the middleware at client sites.

IBM subcontracted a large portion of the wrapper writing process to NetGenics several years ago.

NetGenics said that the wrapper development cycle, which currently takes three to six months of full-time programming, has been shortened to approximately two weeks by building an XML-based generic wrapper that enables quick integration of disparate data sources with DiscoveryLink.



The International Society for Computational Biology has released a position statement on accessibility to data generated through global expression analysis (GEA) techniques, which include cDNA arrays, oligo arrays, and Serial Analysis of Gene Expression.

“The ISCB believes that all data sets supporting scientific publications should be publicly accessible in their entirety. As standard formats and public repositories are developed for data, journals and funding agencies should require submission of data to public repositories,” according to the society’s motion.

Russ Altman, president of the ISCB, noted in an e-mail accompanying the statement that, “To evaluate GEA papers and place them in context, the full data set must be accessible to the scientific community. Further, many forms of analysis (e.g. clustering and classification) depend on access to the full data set and cannot be replicated on partial data sets.”

Altman warned that a similar situation might soon arise for the analysis of the massive amount of data produced by other technologies such as mass spectroscopy and yeast two-hybrid screens.



Virtual Genetics said last week that AstraZeneca’s medicinal chemistry department is using its Virtual Predict data mining software to predict the water solubility of drug candidates.

Virtual Predict uses several inductive logic algorithms to find patterns in large data sets and describes those patterns as rules that the system generates automatically from a training data set. The software can tune the rules and then measure the predictive precision of the model, according to the company.
