The longer the drug development process, the more expensive a failure. AstraZeneca expects a server farm to speed up target discovery.
When Astra and Zeneca merged in 1999, Simon Weston got stuck dealing with the extra traffic. Weston manages the company’s Alderley Edge, UK, bioinformatics service for 1,000 scientists who analyze blood cells of patients with oncological and gastrointestinal diseases. Their work entails running a series of bioinformatics applications on public sequence data.
Weston says his main post-merger challenge was to handle the surge in genome data without incurring downtime. “We needed to boost our throughput whilst planning a major capital investment in hardware,” he says.
A colleague told Weston about Blackstone Technology Group, a systems integration firm in Worcester, Mass., that specializes in setting up computing farms for intensive research. Under the gun, Weston says he didn’t even evaluate other consultancies: “Instead we looked at Blackstone’s technical approach carefully and commissioned a small piece of exploratory work which had a successful outcome. We also chose them because they are vendor-independent, and they specialize in getting BLAST algorithms to run as fast as possible on a compute farm type of architecture.”
For a fee of $75,000, Blackstone reconfigured AstraZeneca’s existing equipment (36 Compaq Alpha CPUs split across two 10-CPU machines and four four-processor boxes) into a computing farm. The consultants lashed together all of the machines and installed Platform Computing’s Load Sharing Facility software, which distributes processing jobs to available nodes and eliminates idle CPU cycles by scheduling tasks around the clock rather than only during peak hours.
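To make the idea of handing jobs to whichever node is free more concrete, here is a minimal sketch of greedy job dispatch across the 36 CPUs described above. It is an illustration only, not Platform Computing's actual scheduling policy or AstraZeneca's configuration: the `schedule` function, the `FARM` layout list, and the job durations are all hypothetical names and values chosen for the example.

```python
import heapq

# Farm layout taken from the article: two 10-CPU machines and four 4-CPU boxes.
# (The list name and representation are assumptions made for this sketch.)
FARM = [10, 10, 4, 4, 4, 4]


def schedule(job_durations, cpus_per_machine):
    """Greedy dispatch: each job goes to the CPU slot that becomes free first.

    Returns (assignments, makespan), where each assignment is
    (job_id, machine, slot, start_time, finish_time).
    """
    # One heap entry per CPU slot: (time the slot is free, machine index, slot index).
    slots = [
        (0.0, machine, slot)
        for machine, count in enumerate(cpus_per_machine)
        for slot in range(count)
    ]
    heapq.heapify(slots)

    assignments = []
    for job_id, duration in enumerate(job_durations):
        # Pop the earliest-available slot, run the job there, push the slot back.
        free_at, machine, slot = heapq.heappop(slots)
        finish = free_at + duration
        assignments.append((job_id, machine, slot, free_at, finish))
        heapq.heappush(slots, (finish, machine, slot))

    makespan = max((a[4] for a in assignments), default=0.0)
    return assignments, makespan


if __name__ == "__main__":
    # 72 hypothetical one-hour jobs spread over the 36 CPUs.
    jobs = [1.0] * 72
    plan, makespan = schedule(jobs, FARM)
    print(f"CPUs: {sum(FARM)}, jobs: {len(jobs)}, makespan: {makespan} hours")
```

In this toy model no CPU sits idle while work is queued, which is the effect the article attributes to the load-sharing software. A production scheduler also has to account for priorities, memory limits, and node failures, none of which are modeled here.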
Weston says computing power has increased without adding any equipment. “We have not completed the benchmarking, but we estimate the throughput increase on BLAST to be four-fold,” he says. “I don’t have precise figures because we are not comparing like with like configuration.”
The facility uses Paracel’s GeneMatcher-Plus to accelerate the HMMER and Smith-Waterman algorithms. It also hosts most publicly available bioinformatics tools and databases, along with a large number of home-grown programs, including workflow tools. These applications are part of “an effort to provide an internal informatics platform to manage, analyze, and integrate the publicly available genomic data with other public and proprietary data,” Weston says. “Attaching higher-quality information to genomic data will help reduce the failure rate of drug discovery projects by filtering the burgeoning pool of targets to reveal the winners.”
Next, AstraZeneca plans to invest about $2.8 million in 200 more Alphas, which it will integrate with the cluster. Weston says he chose the Compaq machines for their compatibility with those already in place and with the applications running on them.
“The type of high-performance computing that we will be doing will move more into data mining of complex data that’s more focused on gene expression and the proteome,” says Weston. “The potential is there for 1,000 scientists globally to do data mining, but in reality, there will probably be different degrees of accessibility.”
He notes that he’s currently looking at data mining options and hasn’t selected a vendor yet.
Weston expects to see a return on his technology investment quickly. “When the first drug makes it through the end of testing and goes to market, and the facilities we’re building have contributed to it, then that will pay for it in one go,” he says.
— Jackie Cohen