The Trouble With Supercomputers

By Matthew Dublin

Scott Fulton has a lengthy post that explores some of the challenges facing HPC developers and the question of whether or not supercomputers will still matter going forward.

Historically, the "next big thing" in computing has always been something big, literally: men in white lab coats and Keds would unveil a device that consumed an entire floor of a building and that only a few specialized technicians could operate.

But Fulton writes that "for the last two or three generations, the computer, by definition, has been something small. 'The next big thing,' as presented by cool-looking guys in black turtleneck sweaters and blue jeans, has been something you hold in your hand. It's something folks can stare at and be amazed while congregating at the coffee shop."

We're sure he's not talking about Apple here…

Fulton goes on to point out that, although it looks like the performance of supercomputers has steadily increased, the expectations being placed upon them far exceed what their current software and hardware architecture can handle.

A number of HPC experts are quoted in the article, including John McCalpin, research scientist at the Texas Advanced Computing Center. According to McCalpin, "the problem is that we want to eat our cake and have it, too…We want machines that are easier to program. On the other hand, we have a tendency to purchase machines based on their peak performance. And as a consequence, we don't provide the economic incentives for any vendors to make machines that are easier to program."

The greatest breakthrough in supercomputing has been the adoption of commercial off-the-shelf, or COTS, processors, which enables designers to build supercomputers with thousands of stacks of multicore processors. In the pre-multicore era circa 2002, COTS processors accounted for less than 10 percent of the Top 500 machines. By November 2010, roughly 84 percent of the compute power on the Top 500 list came from x86/x64 processors. What makes this possible is the development of interconnect technology, but as McCalpin points out in the article, these interconnects cannot address the architectural deficiencies in off-the-shelf processors.

Fulton points to a 2010 report issued by DARPA, which notes that there are traditionally four main ways to enhance the performance of a COTS-based supercomputer: increasing the clock speed of CPUs, decreasing supply voltage to enable tighter component integration, increasing the number of transistors on the CPU, and using interconnects like InfiniBand. But as the report states, "current interconnect protocols are beginning to require energy and power budgets that rival or dwarf the cost of doing computation."