Last year, Intel and AMD began rolling out dual-core processors, and recent benchmarks indicate that most familiar bioinformatics algorithms will see linear speedup on the new chips with minimal tweaking. But early adopters note that the chips won't deliver performance gains in all cases.
Many bioinformatics algorithms, with Blast being the primary example, are able to exploit parallel processing, which is exactly what the new dual-core chips offer: twice as many processing units on a single chip. This feature makes these applications a natural fit for the new technology, so that a single server with four dual-core CPUs will run just as quickly as, if not a bit faster than, eight processors in any other configuration.
But while this capability was expected, there were no guarantees that the chips would perform as promised when they first entered the market, according to Chris Dwan, a principal at consulting firm The BioTeam, which recently completed a benchmarking study of Intel's dual-core Xeons running Blast, ClustalW, and MrBayes.
"The basic problem is that you've got the same number of pins at the bottom of the [chip] that are attaching it to the motherboard, so you've got the same number of physical pins serving two processing cores. Naturally, if you look at that, it might be a bottleneck," he said. "But what I saw was that it's not actually limited in that way."
Dwan stressed, however, that the algorithms he used in the study are all available in parallel implementations. "If you've got some program that does not take advantage of parallelism, that's not written in a parallel manner, having a multi-core system is not going to speed that up, necessarily," he said.
Likewise, algorithms that require a lot of memory may not see the same kinds of performance gains as something like Blast. Another recent benchmarking study, from Scalable Informatics, evaluated several bioinformatics and computational chemistry algorithms on AMD's dual-core Opterons and reported results similar to BioTeam's, except in the case of the Amber molecular dynamics suite.
According to the Scalable Informatics study, Amber performed a bit slower in the dual-core implementation because it "consumes significant memory bandwidth," which resulted in a "slight advantage" for the single-core system with independent memory.
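Dwan's caveat about programs that are not written in a parallel manner can be made concrete with Amdahl's law, a standard rule of thumb that is not drawn from either benchmarking study. The sketch below assumes that a fixed fraction of a program's work divides evenly across cores while the rest runs serially; the function name and the sample fractions are illustrative only, and the model ignores the kind of memory-bandwidth contention reported for Amber.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Estimate speedup under Amdahl's law.

    parallel_fraction: share of the runtime that can be spread across cores
                       (an assumed input, not a measured value).
    cores:             number of cores available to the program.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part takes the same time regardless of core count;
    # only the parallel part shrinks as cores are added.
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # Hypothetical parallel fractions, chosen only to show the trend.
    for fraction in (0.0, 0.5, 0.95, 1.0):
        speedup = amdahl_speedup(fraction, cores=2)
        print(f"{fraction:.2f} parallel -> {speedup:.2f}x on two cores")
```

Under these assumptions, a fully serial program gains nothing from a second core, a program that is half parallel tops out at roughly 1.33 times its single-core speed, and only a fully parallel workload reaches the twofold gain.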
Chip manufacturers claim that dual-core systems offer lower total cost of ownership and better performance than single-core systems with the same number of processors. In addition to the advantages for parallel code, the multi-core chips promise to speed multitasking by running concurrent jobs in about the same time it would take to run a single task alone. For developers, manufacturers claim that the new processors will reduce compile times and will also enable more advanced software features that can take advantage of the multi-threaded architecture.
Furthermore, the dual-core chips offer an option for IT departments that want to increase their computational power, but are running short on space — and cooling power. According to Intel spokesman Bill Kircos, "There is significantly lower power, so there's a net effect of lower electricity bills, and you can pack more servers into a rack than before."
But Dwan said that the biggest beneficiary of the new technology may not be the cluster crowd, but "the bioinformaticist with the workstation at their desk." Currently, he said, "they don't really see much of a difference if a tool exploits parallelism or not because they've only got 2 CPUs in it. When they have eight, it's going to make a big difference, and the difference is that to get that eight, or 16, or whatever CPU power, they're not going to have to go down to the supercomputing center. It's just going to be what ships with their desktop."
Indeed, the question for most users won't be whether to upgrade to the new multi-core systems, but when. Intel and AMD released their first dual-core processors last year, and both firms have ambitious plans for the technology. AMD already sells dual-core versions of most of its chips, with the rest scheduled for release before the end of the year. The company's three-year technology outlook notes that the rollout of chips with "more than two cores" will begin in 2007.
According to Intel's Kircos, "exiting this year, we feel like about 70 to 90 percent of all of our products, desktop to server, will be shipping dual-core, and we'll move to multi-core in the future," although he said that the company hasn't yet set a date for when it expects to move to more than two cores.
For the time being, however, the new technology raises questions for both IT managers and software developers.
"This is sort of the new direction for commodity chips," said Dwan. "The interesting things to me are two-fold: One is how would I build a supercomputer out of that? Would I rather have more machines with fewer CPUs, or fewer machines with more? And the other question is, as a developer, a software person, how does that affect how I write my codes?"
On the hardware side, Dwan said, there are pros and cons associated with both options. One consideration, he said, is the number of systems that an administrator has to maintain. "If I have a problem that I happen to know requires four CPUs cranking full time, if the choice is [whether to put] all of those in [one] box, or spread them across four boxes, there's more complexity to spreading out across four machines. You've got to provide power to four machines, you've got to maintain a network between them, and all that."
Some Multi-Core Processor Resources
BioTeam's dual-core Xeon benchmarking report
Scalable Informatics' dual-core Opteron benchmarking report (NCBI Blast, HMMer, GAMESS, Amber): http://enterprise2.amd.com/downloadables/Dual_Core_Performance.pdf
Intel multi-core processing resources
AMD multi-core processing resources
On the other hand, he said, "If I put all the eggs in one basket — all the chips in one machine — and that machine goes offline for whatever reason, then I have zero. So if I've got processing spread out across several machines, I might be able to lose one and degrade gracefully."
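The trade-off Dwan describes comes down to simple arithmetic, sketched below. The function name and the cluster shapes are hypothetical, and the sketch assumes that a failed machine takes all of its CPUs offline and that the remaining machines keep working independently.

```python
def surviving_cpus(machines: int, cpus_per_machine: int, failed_machines: int) -> int:
    """Count the CPUs still available after some machines go offline.

    Assumes every machine has the same number of CPUs and that a failed
    machine contributes nothing until it is repaired.
    """
    if machines < 1 or cpus_per_machine < 1:
        raise ValueError("machines and cpus_per_machine must be at least 1")
    if not 0 <= failed_machines <= machines:
        raise ValueError("failed_machines must be between 0 and machines")
    return (machines - failed_machines) * cpus_per_machine


if __name__ == "__main__":
    # Two hypothetical ways to provision four CPUs.
    one_box = surviving_cpus(machines=1, cpus_per_machine=4, failed_machines=1)
    four_boxes = surviving_cpus(machines=4, cpus_per_machine=1, failed_machines=1)
    print(f"One four-CPU machine, one failure: {one_box} CPUs left")
    print(f"Four one-CPU machines, one failure: {four_boxes} CPUs left")
```

With one four-CPU machine, a single failure leaves nothing running; with four single-CPU machines, the same failure leaves three CPUs, which is the graceful degradation Dwan refers to. The sketch does not capture the added power and networking overhead of the four-machine option.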
On the software side, he said, "tools that take advantage of parallel systems are going to get a lot more popular, because they're going to get a lot more powerful."
Intel's Kircos noted that the shift to multi-core systems will present "a fundamental change in how to write software." For the last 30 years or so, "everyone has written software for … faster megahertz, and kind of one instruction at a time," he said. Now, developers will have to adapt to parallel processing.
"It is nothing less than kind of redefining how you write software, and that is definitely a challenge, but it's where everybody's going," he said.
Kircos said that Intel's introduction of hyperthreading technology several years ago helped give many developers a head-start for multi-core programming. Hyperthreading "helped kind of set the table for the multi-core era," he said. The approach, which he described as "tricking the operating system and software into thinking that there already were two cores" offered modest performance gains for single-core systems, "but the kind of hidden agenda around that was, 'OK, we're already going this way, so here's a couple of years in advance of when we ultimately get to two, four, or eight cores.'"
Users in the bioinformatics community and elsewhere are just now beginning to upgrade to the dual-core platforms, and Dwan said that BioTeam's customers are expressing a lot of curiosity about the new chips. "The question is always price/performance — how much bang can I get for my buck on a particular algorithm?"
Dwan said he recommends that customers "definitely look at these things and apply the same sort of thinking that you would have applied anyway. … It's not a total jump in the way systems work — it's exactly like going from one chip on a board to many chips on a board."
Intel's Kircos said that he expects to see disciplines like bioinformatics in the forefront of the adoption curve. Bioinformaticists are "hooked on speed," he said. "They'll rewrite their software fairly quickly to make it multi-threaded, or parallelized. So that can happen quicker in this type of industry, for sure."
Bernadette Toner