Supercomputers & the "Energy Wall" Crisis

By Matthew Dublin

In 1976, the Cray-1 supercomputer stood six feet tall and was the most powerful computer ever built. Thanks to Moore’s Law, however, the Cray-1 is now only about as powerful as the first-generation iPhone. But recently, Moore’s Law has in a sense peaked: processors cannot be built to run much faster because of the heat they produce, which would cause failures as the tiny transistors literally start melting. So supercomputer designers have been forced to build ever-larger systems in order to cram more blades into one machine. Despite the advances in processor hardware design, power efficiency has been neglected, and energy consumption has climbed, both to power the hardware and to run the robust cooling systems needed to remove all of the heat generated by these giant computers.

A case in point is the latest version of the Cray supercomputer, the Cray XE6, housed at the Lawrence Berkeley National Lab (LBNL). Called Hopper, this massive system is the eighth most powerful in the world and uses about three megawatts of electricity, roughly the same amount as 2,000 homes. According to an interview with John Shaft, a researcher at LBNL, in less than ten years the much-anticipated exascale class of supercomputers will consume about 100 megawatts of power, the amount consumed by a small U.S. city such as Novato, CA. The irony of a supercomputer used to model climate change burning up this amount of energy is not lost on Shaft, who says that chip makers need to start applying to HPC hardware design the same approach used to develop powerful and energy-efficient mobile phone technology.

Below is a KQED News interview with Shaft and Dag Spicer, curator at the Computer History Museum in Mountain View, California, discussing the history of supercomputers and Moore's Law.


Not surprisingly, Sumit Gupta at Nvidia has responded to this article with a blog post on how GPUs are paving the way forward as “a much more energy efficient way of supercomputing.” While he does name some HPC systems that incorporate GPUs, and that are more energy-efficient than their predecessors, the inherent limitations of graphics processors mean that they will most likely only ever be integrated as accelerators and will never serve as the sole basis for a supercomputer, removing the energy-hungry CPUs or FPGAs from the picture entirely.

Google Pulls Plug on Google Health

By Matthew Dublin

After only a three-year run, Google has decided to close shop on Google Health, its central repository for patient health information, including prescriptions, medical history, medical records, and more. Google's Web-based venture into personalized medicine is being discontinued due to a lack of adoption and partnerships.

According to the Google blog, "with a few years of experience, we've observed that Google Health is not having the broad impact that we hoped it would. There has been adoption among certain groups of users like tech-savvy patients and their caregivers, and more recently fitness and wellness enthusiasts. But we haven't found a way to translate that limited usage into widespread adoption in the daily health routines of millions of people. That's why we've made the difficult decision to discontinue the Google Health service."

In order to be successful, Google Health also needed lots of partnerships with insurance companies and medical institutions to make data available to consumers. Although some deals were struck, including a partnership with CVS to integrate prescription drug data into the platform, as of last year the service still needed hundreds of insurers to sign up and just couldn't gain enough traction.

Google Health will continue to be operational until January 1, 2012, and users will be allowed to export all of their data for an additional year after that. Any data that does remain on Google's servers will be permanently deleted, although what deletion methods will actually be used is not described.

In the ensuing weeks, Google will allow users to transfer their data to other services that support the Direct Project protocol, an open-source standard for health data exchange.

Google's failure to get the service up and running will leave a vacuum for Microsoft's HealthVault service to gain prominence. Also launched in 2009, HealthVault has recently begun allowing users to upload and download X-rays, ultrasounds, and MRIs, and has added support for mobile devices, including Windows Phone 7, Apple's iOS, and Google's Android.

Back in 2008, Microsoft lent its health service platform to help support a Scripps Translational Science Institute study that scanned 10,000 individuals affiliated with Scripps Health for more than 20 different health conditions. That HealthVault has been plugged into genomics from the start may mean that this platform could stand the test of time moving forward as personal genomics grows.

Supercomputer Aids Parkinson's Research

By Matthew Dublin

Researchers at Oak Ridge National Laboratory and North Carolina State University used ORNL's 2.33 petaflop Jaguar supercomputer to study how copper induces misfolding in the protein alpha-synuclein, which is associated with Parkinson's disease. This misfolding leads to the creation of fibrillar plaques that characterize the disease.

"We knew that the copper was interacting with a certain section of the protein, but we didn't have a model for what was happening on the atomic level," says Frisco Rose, a North Carolina State University graduate student in physics. "Think of a huge swing set, with kids all swinging and holding hands — that's the protein. Copper is a kid who wants a swing. There are a number of ways that copper could grab a swing, or bind to the protein, and each of those ways would affect all of the other kids on the swing set differently. We wanted to find the specific binding process that leads to misfolding."

Rose and NC State colleagues Miroslav Hodak, research assistant professor of physics, and Jerzy Bernholc, director of the NCSU Center for High Performance Simulation, developed a series of computer simulations designed to identify the most likely protein binding scenario. The number of calculations was so large that standard deskside computers would not suffice, so the team reworked the code to run on ORNL's Jaguar.

The results of their research appeared in the June 14th edition of Scientific Reports.

Japan Takes No. 1 Spot on Top500

By Matthew Dublin

A Japanese supercomputer has knocked a Chinese system from the No. 1 spot on the Top500 list. The 37th edition of the Top500 list, released today at the International Supercomputing Conference in Hamburg, has named Japan's K Computer, which is capable of a whopping 8 petaflops peak performance (that's eight quadrillion calculations per second), as the world's fastest supercomputer. Last year, China caused quite a stir when it claimed the number one ranking with its 2.6 petaflop Tianhe-1A supercomputer at the National Supercomputing Center in Tianjin, as tinges of Cold War competitiveness colored discussions of the system in the HPC community in the West.

The US now holds third place on the list with the 1.75 petaflop Jaguar supercomputer, a Cray system located at the US Department of Energy's Oak Ridge National Laboratory. The last time Japan held the top spot on the Top500 list was in 2004, with the "Earth Simulator" supercomputer.

The new Japanese system, housed at the RIKEN Advanced Institute for Computational Science in Kobe, was built using homegrown hardware courtesy of Fujitsu. The K Computer combines 68,544 SPARC64 VIIIfx CPUs, each with eight cores, for a total of 548,352 cores. Unlike the Tianhe-1A system, the K Computer is a purely CPU-based system and contains no GPUs or other specialized hardware accelerator boards. With almost twice as many cores as any other system in the Top500, the K Computer is actually more powerful than the next five systems on the list combined.

And if an 8 petaflop system isn't impressive enough, the K Computer's name is derived from the Japanese word "kei," meaning ten quadrillion, a nod to the 10 petaflops that the K Computer's designers have set as their performance goal.

A K Computer rack:

While the K Computer does consume almost 10 megawatts of power, it is also, for its performance level, the most power-efficient system on the list. The average power consumption of supercomputers on the list is around 540 kilowatts.
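
As a rough check on those figures, the rounded numbers quoted in this post can be turned into a core count and a performance-per-watt estimate. This is only a back-of-the-envelope sketch: it treats "almost 10 megawatts" as a flat 10 MW and uses the rounded 8 petaflops figure, not official Top500 measurements, and the function name is ours.

```python
def megaflops_per_watt(petaflops: float, megawatts: float) -> float:
    """Convert a performance and power figure into megaflops per watt."""
    flops = petaflops * 1e15   # 1 petaflop = 10^15 calculations per second
    watts = megawatts * 1e6    # 1 megawatt = 10^6 watts
    return flops / watts / 1e6


# Figures as quoted in the post (rounded, so the result is approximate).
total_cores = 68_544 * 8                          # CPUs times cores per CPU
k_efficiency = megaflops_per_watt(8.0, 10.0)      # assumes a flat 10 MW

print(f"K Computer cores: {total_cores:,}")
print(f"K Computer efficiency: roughly {k_efficiency:.0f} megaflops per watt")
```

Because the power draw is rounded up to 10 MW here, the real efficiency would come out slightly higher than this estimate.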

Supercomputers are ranked on the Top500 list according to their performance crunching the Linpack benchmark, a program developed by Jack Dongarra at the University of Tennessee for solving a system of linear equations using matrix computation.
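
To give a feel for the kind of problem Linpack times, here is a toy version: solving a small dense system of linear equations by Gaussian elimination with partial pivoting. This is an illustrative sketch in plain Python, not the benchmark code itself, and the function name and the 3x3 example system are our own.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    # Work on an augmented copy so the caller's data is left untouched.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: use the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0:
            raise ValueError("matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        # Eliminate this column from every row below the pivot row.
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    # Back substitution, from the last unknown up to the first.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - tail) / m[row][row]
    return x


a = [[2.0, 1.0, -1.0], [-3.0, -1.0, 2.0], [-2.0, 1.0, 2.0]]
b = [8.0, -11.0, -3.0]
solution = solve_linear_system(a, b)
print([round(value, 6) for value in solution])
```

The benchmark does the same job on matrices with millions of rows, which is why the floating-point rate it reports is taken as a measure of a machine's raw number-crunching power.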

DTU Installs Supercomputer for Systems Bio Research

By Matthew Dublin

The Center for Biological Sequence Analysis at the Technical University of Denmark, or DTU, has installed a supercomputer for systems biology and sustainable biotechnology industrial processes research. Called Anakyklosis — the Greek word for "recycling" — the new system is based on SGI's Altix UV 1000 technology and uses a shared-memory architecture to better enable processing of large data sets. With its current design, the system is able to store and process roughly 2,500 human genomes in its working memory at one time.

"The need for larger and faster computers has become very urgent due to the development of the metagenomics research area," said senior researcher Nikolaj Blom from the new Novo Nordisk Foundation Center for Biosustainability at DTU in a press release. "This deals with mapping the entire genome content of bacterial communities, such as those found in the deep oceans, in wastewater or in our own gut — the resulting amount of data is several thousand times larger than the entire human genome."

Anakyklosis will be put to work on metagenomics systems biology including the identification of new enzymes for the biotech industry and the construction of biological cell factories that will produce inexpensive and sustainable raw materials that could be used in place of oil.

"Systems biology involves research that combines and integrates extremely large data sets, including genetic information. ... The computer's capacity will considerably expand our ability to answer the basic biological questions we face, such as how to get a cell to produce something it was not originally made for," added Søren Brunak, director of the Center for Biological Sequence Analysis, in the release.

Anakyklosis is already linked up to other supercomputers at the center to create a large distributed computing resource.

Making the Most of Galaxy

By Matthew Dublin

Videos of presentations from the 2011 Galaxy Community Conference are now up on Vimeo. The conference is organized by the Netherlands Bioinformatics Centre (NBIC) and was held at the Conference Centre De Werelt in Lunteren in the Netherlands.

Galaxy is an open-source, scalable framework for software tool and data integration. The Galaxy project currently contains a slew of bioinformatics software applications including next-generation sequencing tools and workflows for metagenomics, ChIP-seq, RNA-seq, and much more.

The meeting covered a range of topics, such as the development of new tools, visualization, programming with the Galaxy API, and Galaxy CloudMan. The conference was aimed at software engineers, IT professionals, and analysis tool developers. Application areas touched upon in the talks included biomedical informatics, Taverna workflows in Galaxy, moving data onto Galaxy, next-generation sequencing for pathogen genomes in the clinic, and proteomics data analysis, just to name a few.

Illumina software engineer Kirt Haden presented a talk on how Illumina implemented Galaxy for high-throughput sequencing data analysis. Haden says that he and his team analyze up to 100TB of data per month, and needed a scalable tool that was customizable and allowed for easily automated, reproducible workflows. They have already developed some customized graphical user interface tools, which they are submitting to the Galaxy code base, including the Consensus Assessment of Sequence and Variation (CASAVA) software, which they've demonstrated on very large data sets.

Freddy de Bree, a researcher at the Central Veterinary Institute at Wageningen UR, gave a talk on integrating tools into Galaxy to facilitate pathogenomics analysis and visualization. De Bree and his colleagues are using Galaxy to elucidate zoonoses, including diseases such as Q-fever, Rift Valley virus, Swine fever, and PRRSV. He is also using a Galaxy platform for the development of diagnosis and intervention tools. In his talk, he described how they have Galaxy platforms up and running on a six-core development server and a 32-core production server where they use the pathogen detection program eDetectiV (R package).

About 15 videos of Galaxy presentations from the conference are now available, so head on over to Vimeo and check them out if you're currently using, or considering using, Galaxy for your omics projects.

The Trouble With Supercomputers

By Matthew Dublin

Scott Fulton has a lengthy post that explores some of the challenges facing HPC developers and the question of whether or not supercomputers will still matter going forward.

Historically, the "next big thing" in computing has always been something big, literally, where men in white lab coats and Keds would unveil a device that consumed an entire floor of a building which only a few specialized technicians could operate.

But Fulton writes that "for the last two or three generations, the computer, by definition, has been something small. 'The next big thing,' as presented by cool-looking guys in black turtleneck sweaters and blue jeans, has been something you hold in your hand. It's something folks can stare at and be amazed while congregating at the coffee shop."

We're sure he's not talking about Apple here…

Fulton goes on to point out that, although it looks like the performance of supercomputers has steadily increased, the expectations being placed upon them far exceed what their current software and hardware architecture can handle.

A number of HPC experts are quoted in the article, including John McCalpin, research scientist at the Texas Advanced Computing Center. According to McCalpin, "the problem is that we want to eat our cake and have it, too… We want machines that are easier to program. On the other hand, we have a tendency to purchase machines based on their peak performance. And as a consequence, we don't provide the economic incentives for any vendors to make machines that are easier to program."

The greatest breakthrough in supercomputing has been the adoption of commercial off-the-shelf, or COTS, processors, which enable designers to build supercomputers with thousands of stacks of multicore processors. In the pre-multicore era circa 2002, COTS processors were responsible for less than 10 percent of the Top 500 machines. By November 2010, roughly 84 percent of the compute power on the Top 500 list came from x86/x64 processors. What makes this possible is the development of interconnect technology, but as McCalpin points out in the article, these interconnects cannot address the architectural deficiencies in off-the-shelf processors.

Fulton points to a 2010 report issued by DARPA that says there are traditionally four ways to enhance the performance of a COTS-based supercomputer: increasing the clock speed of CPUs, decreasing supply voltage to enable tighter component integration, increasing the number of transistors on the CPU, and using interconnects like InfiniBand. But as the report states, "current interconnect protocols are beginning to require energy and power budgets that rival or dwarf the cost of doing computation."

Cloud Computing and Traditional Chinese Medicine

By Matthew Dublin

Researchers from China Medical University in Taichung, Taiwan, have released the first compute cloud for virtual screening and de novo drug design that uses data on traditional Chinese medicine. Called iScreen, this cloud computing resource is aimed at exploring traditional Chinese medicine for drug development using the TCM Database@Taiwan, a database of traditional Chinese medicine. For de novo drug design, iScreen provides users with multiple molecular descriptors and a graphical user interface that contains a protein preparation tool that extracts proteins of interest from raw input files and estimates ligand binding site size.

While there are numerous virtual screening Web servers currently online, such as PLANTS, 3DLigandSite, PharmMapper, and DOCK Blaster, the iScreen developers say that theirs is unique as the first to provide Web-based, computer-aided drug design for both TCM docking and de novo drug design.

According to an e-mail exchange I had with iScreen developer Calvin Yu-Chian Chen, an associate professor in the School of Chinese Medicine, all three program subunits (the preparation tool, the screening system, and the de novo design kit) can be operated through the iScreen interface, which is built on CentOS. The Web-based interface also ties into the queuing system that collects jobs and sends them to the process core.

Chen went on to add that the most challenging aspect of developing iScreen was bringing the technical side of the screening operation into an interactive graphical interface. "We aimed to provide a user-friendly system that can perform standard structure-based drug screening as well as more advanced operations," he wrote.

Chen's team is planning to expand iScreen to a VMware ESXi-based hypervisor to allow the user to obtain more computing resources.

NSF Releases iPad App

By Matthew Dublin

The National Science Foundation (NSF) is getting on the iPad bandwagon with its Science360 application, now available in Apple’s iTunes App Store.

Science360 allows users to share content, including video and high-resolution images from NSF-funded institutions, on Facebook, Twitter, or email. The app will also help users keep track of breaking science news from institute press offices, peer-reviewed journals, and NSF centers, as well as funding entities.

"iPads are becoming more prevalent, more reliable and more viable for locating information," said Dana Topousis, acting NSF director of public affairs. "This application will provide at your fingertips another avenue for sharing and accessing breaking science news."

Science360 delivers on its name — images and content are presented to users in a panoramic-like interface with a spherical layout.

Here's a screen shot:

Hacker-Approved Security and Flash Memory for the Cloud

By Matthew Dublin

On the cloud security front, probably one of the hottest topics of Cloud Expo 2011, cloud hosting companies like FireHost are beginning to focus on HIPAA compliance. I had a chance to speak with Bruce MacFadyen, the COO of FireHost, who said they have already racked up a number of life sciences research and health care institute customers. One is the National Breast Cancer Foundation, which switched over to hosting its site on FireHost and immediately saw a 15 percent decrease in Web traffic. It turns out that this decrease was actually due to a drop in malicious activity on the site at the application layer.

Last October, the biostatistics center at Johns Hopkins University and the department of surgery at Duke University School of Medicine selected FireHost to host special projects within their health care divisions that involved confidential patient and research data. And medical desktop and mobile software developer cGate Health also adopted FireHost to help it maintain compliance with the HIPAA and HITECH Acts while sending and receiving massive quantities of patient test result data in electronic medical records.

According to MacFadyen, the real challenge with public clouds is that many use open-source hypervisor software — the software layer on a cloud that allows users to create virtualized compute instances — which is obviously open for attack by hackers who have ready access to the code. To be clear, this is only a problem with public clouds and not a private cloud that one could rent on Amazon wherein you're reserving dedicated hardware that only your cloud will be hosted on.

For what it's worth, Kevin Mitnick, one of the most famous hackers in history, selected FireHost to host his website back in 2009.

Boasting big-name customers including everybody from Facebook to the Lawrence Livermore National Laboratory, Fusion-IO’s Victor Brisebois made a case for NAND flash memory as the hardware of choice for hosting a cloud and maximizing virtualization in the data center. Fusion-IO, which was the first vendor to release a PCI-compatible NAND flash memory board, offers customers several different configurations of its “ioDrives,” which are PCIe cards kitted out with solid-state flash memory with up to 5.12TB of storage.

Last year, LLNL replaced roughly 137 racks of hard drives with just two Fusion-IO flash servers for its Hyperion Data Intensive Testbed.

Brisebois said to think of NAND flash memory in this context as very cheap RAM that can be used for hosting acceleration. The basic pitch for Fusion-IO’s solution is that it can be employed to completely replace the storage area network, or SAN, piece of a network architecture. (SANs are dedicated storage networks used to make disk arrays or tape libraries accessible to other servers.) This means that users can dump an entire database on one NAND flash card and eliminate networking latency, or host a private cloud using a fraction of the physical space and power of a traditional network architecture.

Cloud Expo 2011

By Matthew Dublin

It was a bit telling when, during the large general session talk that kicked off this week's Cloud Expo 2011 in New York City, speaker Bill Zack, an "architect evangelist" with Microsoft, enthusiastically asked the audience of more than 1,000 attendees:

"Show of hands — how many of you people believe that the cloud means infinite compute?"

A few dozen hands went up.

"Alright! Show of hands — and how many of you people believe that the cloud means infinite storage?"

No hands that I could see went up.

After the sound of crickets died down, Zack recovered by saying, "Well, it all depends on what your definition of 'infinite' is..."

Indeed. And it also depends on what your definition of cloud computing is, unless, of course, you're OK with the idea of meaning as a moving target. That there is still such malleability among cloud zealots when it comes to that definition is directly connected to the overblown expectations and the confusion surrounding cloud computing. Zack was, of course, aware of this fact and, after finishing his pitch for Windows Azure, he wrapped up his talk by telling the audience to not become despondent. Keep an eye out for the National Institute of Standards and Technology, he said, which is still hard at work nailing down its definition of cloud computing once and for all.

In an attempt to steer clear of any nebulousness and with an eye toward the pragmatic, Adiascar Cisneros, a professional services manager at Racemi, gave a talk on "on-boarding," which is cloud-speak for moving existing software applications onto a cloud. In a packed ballroom at the Jacob Javits Convention Center, Cisneros started off his talk by looking at various use cases for determining when a private cloud, public cloud, or hybrid cloud is the best choice.

When one audience member asked him what the difference was between hosting your own applications on servers and a private cloud, Cisneros basically said that a private cloud was like hosting your own applications on in-house servers, just...better.

Possibly anticipating the confusion, Cisneros wisely took pity on us all and offered up a simple formula for determining which applications to move to the cloud and onto which cloud architecture to move them. The basic guidelines are:

If your application needs elasticity, users require it to be highly available, and you're also looking to reduce capital expenditure costs for operation, look to the public cloud, like Amazon's EC2.

However, if security is paramount, you are dealing with regulatory and compliance issues, you need top-notch performance, and you have already put a lot of money into developing this software (referred to as "sunk costs"), then a private cloud is for you.

At the end of his talk, Cisneros plugged Racemi's DynaCenter solution, which provides automated configuration and migration of a cloud set-up from private to public, public to private, or either to hybrid and back again. DynaCenter supports Linux, Unix, Windows, VMware, Xen, and Hyper-V, and the user is supposed to see nothing as an administrator moves the cloud between dissimilar hardware or cloud providers.
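
For readers who think in code, the two guidelines Cisneros offered can be written down as a small decision function. This is only a sketch of the rule of thumb as reported here: the class and field names are invented for illustration, and returning "hybrid" when an application meets both sets of criteria is our assumption, since the talk as summarized above gave no explicit rule for that case.

```python
from dataclasses import dataclass


@dataclass
class AppProfile:
    """Traits of an application being considered for a move to the cloud."""
    needs_elasticity: bool = False
    needs_high_availability: bool = False
    wants_lower_capex: bool = False
    security_is_paramount: bool = False
    has_compliance_requirements: bool = False
    needs_top_performance: bool = False
    has_large_sunk_costs: bool = False


def recommend_cloud(app: AppProfile) -> str:
    """Apply the two guidelines; 'hybrid' for the overlap is an assumption."""
    public_fit = (app.needs_elasticity
                  and app.needs_high_availability
                  and app.wants_lower_capex)
    private_fit = (app.security_is_paramount
                   and app.has_compliance_requirements
                   and app.needs_top_performance
                   and app.has_large_sunk_costs)
    if public_fit and private_fit:
        return "hybrid"
    if public_fit:
        return "public"
    if private_fit:
        return "private"
    return "no clear recommendation"


web_store = AppProfile(needs_elasticity=True,
                       needs_high_availability=True,
                       wants_lower_capex=True)
print(recommend_cloud(web_store))
```

An application that ticks only some of the boxes in either list falls through to "no clear recommendation," which is probably the honest answer for a lot of real software.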

HPC Experts Grapple With Personal Genomics

By Matthew Dublin

At the National Center for Supercomputing Applications' Private Sector Program Annual Meeting, the NCSA's Victor Jongeneel presented a talk on the HPC bottlenecks that are holding personal genomics back, and on potential solutions.

According to Jongeneel, one of the biggest challenges is the lack of scalability of popular genome alignment and assembly codes. For example, his team ran a test on ABySS, assembling a modest-sized yeast genome, and found that, based on wall clock time and memory requirements, the code is inherently not scalable. Ultimately, Jongeneel says, this is because many folks developing genomics software are not professional developers, and while the code is complex and innovative, most bioinformatics code is not up to the standards of the HPC community.

When an audience member who identified himself as a representative from Microsoft asked Jongeneel what the ideal solution to this problem might be, he responded that most of the code produced is research grade, and that the technology moves so quickly that it often renders "new" code obsolete in a short time. He went on to say that commercial attempts have fallen short of addressing these challenges for the same reason: as soon as a scalable solution is produced, the rapid movement toward a newer one leaves it in the dust. Jongeneel said that ultimately a fundamental rethinking of compute architectures to allow for workflows with multiple complex steps will be key to making personal genomics a reality.

Click here to watch the whole presentation.

The Limits of Linux

By Matthew Dublin

Henry Newman of enterprisestorageforum.com and Jeff Layton, enterprise technologist for HPC at Dell, have devised a plan to test the limits and define problems associated with the scalability issues of Linux file systems. Both Newman and Layton agree that one of the big problems with Linux file systems is the metadata scan rate:

Let's say you have 100 million files in your file system and the scan rate of the file system is 5,000 inodes per second. If you had a crash, the time to fsck could take 20,000 seconds or about 5.5 hours…THIS IS NOT ACCEPTABLE. Today, a 100-million file system should not take that much time, given the speed of networks and the processing power in systems. Add to this the fact that a single file server could support 100 users and 1 million files per user is a lot, but not a crazy number. The other issue is we do not know what the scan rate is for the large file systems with large file counts. What if the number is not 5,000 but 2,000? Yikes, for that business. With enterprise 3.5 inch disk drives capable of between 75 and 150 IOPS per drive, 20 drives should be able to achieve at least 1,500 IOPS. The question is what percentage of hardware bandwidth can be achieved with fsck for the two file systems?
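
The arithmetic in that passage is easy to reproduce. The short sketch below simply divides the file count by the scan rate and multiplies drives by per-drive IOPS; the function names are ours, and it assumes (as the quote does) one inode scanned per file and a constant scan rate.

```python
def fsck_seconds(file_count: int, inodes_per_second: float) -> float:
    """Time to scan every inode once at a constant scan rate."""
    return file_count / inodes_per_second


def aggregate_iops(drive_count: int, iops_per_drive: int) -> int:
    """Best-case combined IOPS if every drive contributes fully."""
    return drive_count * iops_per_drive


files = 100_000_000
for rate in (5_000, 2_000):
    seconds = fsck_seconds(files, rate)
    print(f"{rate:,} inodes/s: {seconds:,.0f} s, about {seconds / 3600:.1f} hours")

low, high = aggregate_iops(20, 75), aggregate_iops(20, 150)
print(f"20 drives: {low:,} to {high:,} IOPS")
```

At 2,000 inodes per second, the same 100 million files would take 50,000 seconds, or nearly 14 hours, which is the "Yikes" scenario the authors have in mind.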

According to their article, the file system community has not taken these concerns seriously, which is why the concept of a 500 TB single name space Linux file system is still — surprisingly — years away.

This is just the beginning of the series these guys are writing on the issues with Linux file systems, which you can follow here. They plan on publishing the description of their test next, followed by testing reports, and finally an analysis of the results.