Researchers in India Plan Bioinformatics Boost With High-Speed Parallel Supercomputer

An Indian government lab has outfitted the country’s fastest parallel supercomputer for bioinformatics research in an attempt to coax university and industry scientists into pursuing highly computation-intensive problems in biology.

Researchers at India’s Center for Development of Advanced Computing (CDAC) in Pune have tailored popular bioinformatics packages, originally developed at US universities, to run on the center’s “Param10000” supercomputer.

The bioinformatics team at CDAC has also developed parallel genetic algorithms for multiple sequence alignment and protein structure analysis.

“We’re trying to make available high-end computing resources for bioinformaticists in India,” Rajendra Joshi, bioinformatics coordinator at CDAC, told BioInform.

The researchers have already ported two molecular modeling packages, AMBER and CHARMM, to Param10000, a parallel machine built from multiple Sun UltraSparc nodes and designed for a peak performance of 100 Gflops.

Param was India’s answer to technology embargoes. In the mid-1980s, the US and Japan denied India supercomputers on the grounds that they would be used for its nuclear and missile programs.

In response, government labs in India, including CDAC, designed parallel supercomputers, procuring off-the-shelf processors and writing codes that distribute problems across multiple processors.

The Param, installed in over 25 universities and labs across India, is now routinely used in meteorology, seismic analysis, oil prospecting, and fluid dynamics.

“We’re hoping that an opportunity to exploit Param for bioinformatics will encourage scientists here to take up problems they might have shirked earlier,” Joshi said.

CDAC may have itself wrested the first research results from bioinformatics on Param: an insight into mechanisms that underlie trinucleotide repeats associated with Huntington’s disease, a neurodegenerative disorder.

Joshi used a parallelized version of AMBER to simulate the 3D structure of CAG, the trinucleotide repeat associated with Huntington’s disease. While this trinucleotide repeat occurs up to 35 times in the general population, individuals with Huntington’s disease may have up to 121 such repeats.
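To make the repeat counts concrete, the following minimal Python sketch counts the longest uninterrupted run of CAG triplets in a DNA string. It is only an illustration of what "number of repeats" means, not the structural simulation Joshi ran. The function names are hypothetical, and the figure of 35 is simply the upper end of the general-population range quoted above, not a diagnostic cutoff.

```python
import re


def longest_cag_run(seq: str) -> int:
    """Return the largest number of consecutive CAG triplets found in seq."""
    # Find every maximal stretch made only of back-to-back "CAG" units.
    runs = re.findall(r"(?:CAG)+", seq.upper())
    # Each unit is 3 bases long; an empty list means no CAG at all.
    return max((len(run) // 3 for run in runs), default=0)


def exceeds_general_range(repeat_count: int, upper: int = 35) -> bool:
    """Hypothetical helper: is the count above the range quoted in the article?"""
    return repeat_count > upper


if __name__ == "__main__":
    # Toy sequence invented for illustration: 5 repeats, a spacer, then 2 more.
    toy = "ATG" + "CAG" * 5 + "TTT" + "CAG" * 2
    n = longest_cag_run(toy)
    print(n, exceeds_general_range(n))
```

Counting only uninterrupted runs is a design choice made for simplicity here; the article does not say how repeats are tallied in practice.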

“Structural studies suggest that this trinucleotide sequence is kind of predisposed to repeating itself,” said Joshi.

The CAG repeat studies involved simulating the behavior of 16,000 atoms over a nanosecond, a simulation that lasted 96 hours on a 16-processor Param. It could have taken up to eight weeks on conventional single-processor workstations that researchers typically use in India.
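The run times quoted above imply a rough speedup that can be worked out directly. The short Python sketch below does the arithmetic under two assumptions that the article does not state: that the workstation would run around the clock for the full eight weeks, and that "up to eight weeks" is treated as an upper bound, so the results are upper bounds too.

```python
# Figures quoted in the article.
PARALLEL_HOURS = 96       # wall-clock time on the 16-processor Param
PROCESSORS = 16
SERIAL_WEEKS_UPPER = 8    # "up to eight weeks" on a single-processor workstation

HOURS_PER_WEEK = 7 * 24   # assumption: the workstation runs continuously


def speedup(serial_hours: float, parallel_hours: float) -> float:
    """Ratio of serial run time to parallel run time."""
    return serial_hours / parallel_hours


def efficiency(speedup_value: float, processors: int) -> float:
    """Fraction of ideal linear speedup achieved per processor."""
    return speedup_value / processors


if __name__ == "__main__":
    serial_hours = SERIAL_WEEKS_UPPER * HOURS_PER_WEEK
    s = speedup(serial_hours, PARALLEL_HOURS)
    e = efficiency(s, PROCESSORS)
    print(f"serial: {serial_hours} h, speedup: {s:.1f}x, efficiency: {e:.1%}")
```

Under these assumptions the speedup comes to at most about 14 times on 16 processors. The efficiency figure is only meaningful if a workstation processor is comparable to one Param processor, which the article does not confirm.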

Param’s configuration would depend on the problem to be solved. An eight-node cluster with 32 processors would be good enough for simulations involving protein molecules, DNA-protein complexes, and drug molecules bound to proteins.

The CDAC team is also working on parallel codes for genome sequence analysis. And a parallel genetic algorithm that CDAC bioinformaticist Lourdusamy Anbarasu has developed is “qualitatively better” at multiple sequence alignment than ClustalW or sequential genetic algorithms, according to Anbarasu.

A special user interface for bioinformatics on Param will allow scientists to work on the system without having to learn parallel processing. “For most scientists, it will be business as usual, but a lot faster,” Joshi said.

— GM

 
