Turns out that Microsoft isn’t only concerned with viruses of the electronic variety. In March the company announced that machine-learning software developed within its research labs may help design more effective vaccines to fight HIV.
In a collaboration with the University of Washington and Australia’s Royal Perth Hospital, computer scientists at Microsoft Research are adapting software originally designed for computer vision and spam blocking to sift through large genetic data sets in order to identify features that may lead to improved vaccines.
The program the researchers are using, called Epitome, was designed by Microsoft’s Nebojsa Jojic to condense information by identifying areas of “self similarity” in large data sets. This capability was originally created to enable Microsoft’s e-mail software to differentiate spam from legitimate messages. Adapting the software to HIV research could eventually address the extreme variability of the virus, which has been one of the persistent challenges of AIDS research.
Jim Mullins, chair of the department of microbiology at the University of Washington, says that the Epitome software has allowed his group to compress the information in the database they were searching against, and then add information describing the variations that exist within the database.
The upshot is a 10-fold speedup in filtering patient data, according to Simon Mallal, executive director of the Royal Perth Hospital’s Center for Clinical Immunology and Biomedical Statistics.
Several vaccine models developed using the approach are currently undergoing wet-lab validation. Jojic says he expects to complete the first phase of lab testing in around six months, but adds that preliminary tests “have verified some of our assumptions.”
— Bernadette Toner
US Patent 6,850,876. Cell based binning methods and cell coverage system for molecule selection. Inventors: Raymond Lam, William Welch, Sidney Young. Assignee: SmithKline Beecham (now GlaxoSmithKline). Issued: February 1, 2005.
Protects a cell-based or data-driven binning method for providing a representative set of compounds for high-throughput screening. A chemical space coverage criterion measures the uniformity of coverage of the molecules selected, and a fast exchange design algorithm minimizes the number of searches of the candidate points while maximizing the number of exchanges during each pass through the candidate points.
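To make the general idea of cell-based binning concrete, here is a minimal Python sketch. It is not the patented method: the uniformity criterion and the fast exchange algorithm are not detailed in the abstract, so this only shows the basic step of dividing a descriptor space into cells and keeping one compound per occupied cell. The function names, the assumption that descriptors are scaled to the range 0 to 1, and the "fraction of cells occupied" coverage measure are all illustrative.

```python
from collections import defaultdict


def cell_of(descriptors, bins_per_axis, lo=0.0, hi=1.0):
    """Map a descriptor vector to the index of the grid cell that contains it.

    Assumes every descriptor is scaled to [lo, hi]; out-of-range values are
    clamped into the first or last bin.
    """
    width = (hi - lo) / bins_per_axis
    return tuple(
        max(0, min(int((d - lo) / width), bins_per_axis - 1))
        for d in descriptors
    )


def select_representatives(compounds, bins_per_axis):
    """Pick one compound per occupied cell (alphabetically first, for determinism)."""
    cells = defaultdict(list)
    for name, descriptors in compounds.items():
        cells[cell_of(descriptors, bins_per_axis)].append(name)
    return sorted(min(names) for names in cells.values())


def coverage(compounds, bins_per_axis):
    """Fraction of all grid cells holding at least one compound (a simple stand-in criterion)."""
    dims = len(next(iter(compounds.values())))
    occupied = {cell_of(d, bins_per_axis) for d in compounds.values()}
    return len(occupied) / (bins_per_axis ** dims)


# Hypothetical compounds described by two scaled descriptors each.
library = {
    "c1": (0.10, 0.10),
    "c2": (0.15, 0.20),
    "c3": (0.90, 0.80),
    "c4": (0.60, 0.10),
}
picked = select_representatives(library, bins_per_axis=2)
covered = coverage(library, bins_per_axis=2)
```

With two bins per axis, c1 and c2 fall into the same cell, so only one of them is kept for screening.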
US Patent 6,850,846. Computer software for genotyping analysis using pattern recognition. Inventors: Eugene Wang, Teresa Webster. Assignee: Affymetrix. Issued: February 1, 2005.
Covers methods, systems, and computer software products for determining the genotype of a sample using a plurality of probes. In one version, a tentative genotype call is made based upon the relative allele signals. Pattern recognition is then used to validate the tentative call. Preferred methods for determining the similarity of probe intensity patterns include evaluating the linear correlation coefficient between probe intensities, and accepting the tentative genotype as the genotype of the sample if the linear correlation coefficient is greater than a threshold value.
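The two-step logic described in the abstract, a tentative call followed by a correlation check, can be sketched roughly as follows in Python. This is an illustration under stated assumptions, not Affymetrix's implementation: the allele-fraction cutoffs, the reference intensity patterns, the 0.9 correlation threshold, and all function names are hypothetical.

```python
import math


def pearson(xs, ys):
    """Linear (Pearson) correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0  # a flat pattern carries no correlation information
    return sxy / math.sqrt(sxx * syy)


def tentative_call(a_signal, b_signal, low=0.3, high=0.7):
    """Tentative genotype from the relative allele signal (cutoffs are assumed)."""
    total = a_signal + b_signal
    if total <= 0:
        return None
    fraction_a = a_signal / total
    if fraction_a >= high:
        return "AA"
    if fraction_a <= low:
        return "BB"
    return "AB"


def validated_call(probe_intensities, a_signal, b_signal, reference_patterns,
                   min_corr=0.9):
    """Accept the tentative call only if the probe pattern correlates with its reference."""
    call = tentative_call(a_signal, b_signal)
    if call is None:
        return None
    r = pearson(probe_intensities, reference_patterns[call])
    return call if r > min_corr else None


# Hypothetical reference probe-intensity patterns for each genotype.
reference = {
    "AA": [10, 2, 9, 1],
    "AB": [6, 6, 5, 5],
    "BB": [1, 9, 2, 10],
}
accepted = validated_call([20, 4, 18, 3], 900, 100, reference)
rejected = validated_call([3, 18, 4, 20], 900, 100, reference)
```

In the second call the allele signals suggest AA, but the probe intensities run opposite to the AA reference pattern, so the tentative call is rejected.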
Even as traditional genomics-based software vendors see their revenues decline, three bioinformatics companies (GeneGo, Jubilant Biosys, and Ingenuity Systems) have announced a total of eight agreements for their pathway informatics tools so far in 2005.
As planned, NCBI retired the public LocusLink website on March 1. Standard URLs to LocusLink will now be redirected to Entrez Gene, and the files at the LocusLink FTP site have been moved or copied to the Archive subdirectory.
Gene Codes Forensics has begun working with the Minister of the Interior of Thailand to apply the company’s DNA analysis software, called M-FISys (Mass-Fatality Identification System), to identify human remains from the recent tsunami in South Asia.
SRI International will serve as a subcontractor on a $12.7 million grant awarded to Lawrence Berkeley National Laboratory by NCI to develop tools for predicting cancer therapy response. SRI’s subcontract, worth $1 million, is to apply its Pathway Logic software to develop a model of cellular signaling networks related to human breast cancer.
NIEHS will use a Cray XD1 supercomputer for research in protein-structure analysis and molecular modeling. Cray will install the system in the first quarter of 2005 in an NIEHS facility in Research Triangle Park, NC.
NIH is outlicensing two bioinformatics analysis methods developed by scientists at NHGRI. The first method is a general approach to using supervised artificial neural networks to classify diseases. The second specifically uses ANNs to classify cancers using gene expression data.