Intel's Path to Exascale Computing Very, Very Hot

By Matthew Dublin

In order to make exascale computing a reality, engineers at Intel are aiming to push Moore's Law as far as it can go: processors 100 times faster than current ones by 2018, to be exact. One approach to exascale-capable processor design is to stack chips and transistors on top of each other so that processors will be built more like cubes rather than traditional flat chips. The reason for a cube-shaped chip architecture is to facilitate faster data transfer in and out of the processor cores.

As Steve Pawlowski, an exascale computing researcher and Intel senior fellow, explained to The Register at the recent European Research and Innovation Conference in Ireland, by stacking on top of the CPU die, designers can build wider memory interfaces or increase bandwidth, thereby reducing the distance a signal has to travel from A to B.

The only problem with stacking is the one it's implemented to surmount: Moore's Law. In lab experiments so far, the heat created by stacking chips has the undesirable effect of melting the chips. Despite this unfortunate side effect, Pawlowski is convinced that this design is the way to go, and that the answer will not be found in constructing HPC chips with new materials such as graphene, which IBM has already shown can be used to create very fast chips.

"Every time I hear this technology is going to run out of gas in ten years and we’re going to need something new, there’s always some new way of engineering or some new creative way to use the material that gives you a longer life," says Pawlowski.

Dell Adds Ocarina Networks Software to Hardware

By Matthew Dublin

Dell is now packaging its DX6000 storage racks with software that uses Ocarina Networks' data compression solution.

Ocarina uses a deduplication approach that compresses a multitude of supposedly incompressible image files, and its approach is so effective that Dell bought the startup last year. Ocarina began making an appearance at Bio-IT World a few years ago and has been turning heads with its innovative compression software, including an entire suite tailor-made for bioinformatics.

Ocarina's ECOsystem includes over 100 algorithms that support over 600 file types, such as large, graphics-intensive genomics files like Affymetrix’s .dat, .arr, and .cel outputs, for which the Ocarina ECOsystem achieves an average of 50 percent compression; and Illumina files such as .tiff and .txt, for which it achieves 45 percent to 85 percent compression.

The DX6000G Storage Compression Node (SCN) can be deployed as a single node or as multiple clustered nodes. It is anticipated that there will typically be from one to four nodes, but there can be more, and the system has been tested with up to 2.5PB of data stored across 46 DX6000 storage nodes. The DX6000Gs contain up to 8TB of disk capacity for scratch files using SATA or nearline SAS disks.

Innovative Chip Design for HPC

By Matthew Dublin

Samsung and Micron have announced a partnership to develop a new type of memory chip, designed specifically for HPC. Called the Hybrid Memory Cube Consortium, the new partnership is ambitiously claiming to build a chip with 15 times the memory bandwidth of current HPC chips. If they develop an approach that really works seamlessly in a multi-terabyte environment, there could be a slew of application areas for bioinformatics.

While the Hybrid Memory Cube Consortium plans to release its new chip architecture sometime in 2012, HP has announced that it intends to release its "memristor" memory, a two-terminal non-volatile memory technology, in the next 18 months. HP's memristor is being pitched as a replacement first for flash memory, then for DRAM (dynamic random-access memory) in three to four years, and then for SRAM (static random-access memory).

The Hybrid Memory Cube approach is to place the logic layer of the chip on the bottom and the memory densely stacked on top in a cube as opposed to a flatter type of architecture common in current chips.

A Conversation with NYU's Bud Mishra

By Matthew Dublin

We had a chance recently to sit down with NYU's Bud Mishra, who discussed the work of the NYU Bioinformatics Lab including FPGA software development and their new base-calling algorithm called TotalReCaller.

Amazon Greenlights Linux Cloud

By Matthew Dublin

Amazon Web Services (AWS) has removed the "beta" tag from their Amazon Linux AMI cloud that was launched about a year ago. The Linux cloud is now allegedly stable and secure, and adoption has been steady, according to the AWS blog. Current Linux AMI users have the choice of either upgrading to the most recent release or launching brand new instances.

According to the announcement on their blog, in addition to taking the Linux cloud out of beta, AWS is also adding a ton of new features and services, including the following:

• The Message of the Day now tells you when updates to installed packages are available.
• While the AMI’s default configuration is set to provide a smooth upgrade path from release-to-release, you can now lock the update repositories to a specific version to inhibit automatic updates to newer releases.
• Security updates are automatically applied on the initial boot of the AMI. This behavior can be modified by passing user data into the AMI with cloud-init.
• There's a new Amazon Linux AMI Security Center.
• Puppet has been added to the repositories and is available for system configuration management.
• Access to the Extra Packages for Enterprise Linux (EPEL) repository is configured, though not enabled by default. EPEL provides additional packages beyond those shipped in the Amazon Linux AMI repositories, but these third party packages are not supported.
• The cfn-init daemon is installed by default to simplify CloudFormation configuration.
• A total of 50 new packages are available including the command line tools for AWS, Dash, Dracut, Facter, Pssh, and Varnish. 227 other packages have been updated and 9 have been removed. For a full list of changes, refer to the Amazon Linux AMI Release Notes.

SDSC Hosts Largest Academic Cloud

By Matthew Dublin

The San Diego Supercomputer Center at the University of California, San Diego, has launched what it claims is the largest academic-based cloud storage system in the country. The system initially offers 5.5 petabytes of raw storage and is 100 percent disk-based with high-speed 10 gigabit Ethernet network interconnections.

SDSC's Cloud uses two Arista Networks 7508 switches, providing 768 total 10 gigabit Ethernet ports for more than 10Tbit/s of non-blocking, IP-based connectivity. The switches are configured using multi-chassis link aggregation, or MLAG, for both performance and failover. The cloud is built atop OpenStack Swift Object Storage, a scalable, open-source cloud operating system that was developed by NASA and Rackspace Hosting in 2010. The software currently powers some of the largest public and private cloud computing services.

The SDSC Cloud will work with the other storage technologies at the supercomputer center, including the Data Oasis system, a Lustre-based parallel file system that researchers running jobs on the Triton, Trestles, and Dash supercomputers use to store their results and datasets.

Users can store their data in the cloud for as little as $3.25 per month for 100GB, or $32.50 per terabyte per month, with no transfer costs.

Subterranean Data Center Cooling

By Matthew Dublin

The IT team at the US Department of Energy's National Renewable Energy Laboratory in Golden, Colorado, has built a data center that pipes air underground for a cheap and effective cooling solution. The NREL's Research Support Facility building that houses the data center is a Leadership in Energy and Environmental Design (LEED) Platinum building designed to be "net zero" on energy usage. LEED is an internationally recognized "green" building certification system that uses metrics for water efficiency, CO2 emissions reduction, and energy savings.

Their data center rarely utilizes air conditioning and instead employs two separate systems to remove heat and cool the equipment. The NREL cooling system starts with outside air that travels through an underground channel and is cooled to roughly 54 degrees before it is pumped into the data center. The hot air from the data center is then pumped through the building for heating during the cooler months.

Remarkably, their data center has reportedly achieved a Power Usage Effectiveness rating of 1.15. The PUE metric is calculated by adding together the power required for the cooling system, the power management system, and the actual computer equipment, and dividing that number by the power required to operate just the equipment. At that rating, the cooling and power management systems required to run their data center draw only about 0.15 watts for every watt used by the actual computer hardware.
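The PUE arithmetic described above can be sketched in a few lines of Python. This is only an illustration: the wattage figures in the example are hypothetical values chosen to reproduce the reported 1.15 rating, not measurements from NREL, and the function names are made up for this sketch.

```python
def pue(it_watts, cooling_watts, power_mgmt_watts):
    """Power Usage Effectiveness: total facility power divided by the
    power drawn by the computer equipment alone."""
    if it_watts <= 0:
        raise ValueError("IT equipment power must be positive")
    total_watts = it_watts + cooling_watts + power_mgmt_watts
    return total_watts / it_watts


def overhead_per_it_watt(pue_value):
    """Watts of cooling and power-management overhead per watt of IT load."""
    return pue_value - 1.0


# Hypothetical load: 1,000 W of servers, 100 W of cooling, 50 W of power management.
example_pue = pue(it_watts=1000, cooling_watts=100, power_mgmt_watts=50)
example_overhead = overhead_per_it_watt(example_pue)
print(f"PUE = {example_pue:.2f}, overhead = {example_overhead:.2f} W per IT watt")
```

A PUE of exactly 1.0 would mean every watt goes to the computer equipment, so the part after the decimal point is the overhead per watt of IT load.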

NCSA Fires Up CPU/GPU Supercomputer

By Matthew Dublin

The National Center for Supercomputing Applications has announced the rollout of its Forge supercomputer, a 153-teraflop system that uses CPUs and GPUs.

The majority of compute time on Forge will be offered through the National Science Foundation's Extreme Science and Engineering Discovery Environment, or XSEDE program, a nationwide partnership of 20 institutions. XSEDE is led by the NCSA and provides resources, including services, tools, and technical support to the research user community. Whatever time is left on Forge will be allocated to the NCSA's Private Sector Program, an initiative that joins commercial businesses with NCSA research staff, as well as to faculty, staff, and students at the University of Illinois at Urbana-Champaign.

Forge is composed of 18 Dell PowerEdge C6145s that contain 36 nodes of dual-socket/eight-core AMD processors, with NVIDIA M2070 Fermi GPUs housed in Dell's C410x PCI expansion enclosures. There are eight Fermi units for each node, for a total of 288, and each NVIDIA M2070 provides more than 500 gigaflops of double-precision performance and 6GB of GDDR5 memory. A QDR InfiniBand fabric will connect the nodes; 700 terabytes of GPFS filesystem space will be provided by two Data Direct Networks SFA 10000 units. The I/O bandwidth of the system is expected to exceed 16GB/sec.
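As a quick sanity check, the figures quoted above can be tallied in Python. The two-nodes-per-chassis value is an inference from the 18 chassis and 36 nodes mentioned in the text, and the 500 gigaflops per GPU is treated as a lower bound, so the result is a floor on the GPU contribution rather than an official number.

```python
CHASSIS = 18
NODES_PER_CHASSIS = 2        # inferred: 36 nodes spread over 18 chassis
GPUS_PER_NODE = 8
GFLOPS_PER_GPU_FLOOR = 500   # "more than 500 gigaflops" per M2070
GPU_MEMORY_GB = 6

nodes = CHASSIS * NODES_PER_CHASSIS
total_gpus = nodes * GPUS_PER_NODE
gpu_tflops_floor = total_gpus * GFLOPS_PER_GPU_FLOOR / 1000
total_gpu_memory_gb = total_gpus * GPU_MEMORY_GB

print(f"{nodes} nodes, {total_gpus} GPUs")
print(f"At least {gpu_tflops_floor:.0f} teraflops from the GPUs alone")
print(f"{total_gpu_memory_gb} GB of GDDR5 across all GPUs")
```

The GPU floor of 144 teraflops sits just under the 153-teraflop figure quoted for the whole system, which suggests the GPUs account for the bulk of Forge's peak performance.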

DIY Data Center

By Matthew Dublin

Facebook’s hardware team has opted to design their own server hardware instead of building a server farm using commercial solutions. According to Bloomberg, this marks a growing trend wherein companies like IBM, Dell, and Hewlett-Packard are losing considerable sales as Google, Facebook, and other big data companies are taking the do-it-yourself approach.

Customized machines make up 20 percent of the U.S. market for servers, and computer makers are feeling the pain. For the big data houses and server farms that support Google, Facebook, and Amazon, a no-frills rack unit that has only the most basic of components is far more energy efficient and cheaper than any sleekly packaged commercial blade.

For example, Facebook’s DIY servers have custom power supplies and circuit boards in sheet-metal enclosures designed to maximize airflow with the minimum number of fans. Google’s servers are built only with the barest components necessary to run their stripped-down version of the Linux operating system, and Microsoft has designed servers to host its cloud computing service.

In 2009, Google first unveiled the design of its servers, which it had previously been tight-lipped about. Each server has its own 12-volt battery to supply power if there's a problem with the main source of electricity. Power efficiency is a fundamental obsession of Google's server design team, and this design has by now been through at least ten iterations.

[Image: Google's homemade servers as of 2009]

As Bloomberg reports, the computer industry has little choice but to adapt and the trend of using cheaper servers for cloud computing may continue to spread to other application areas as well. Given the DIY spirit of the bioinformatics community, and its growing interest in cloud computing, homegrown server hardware could become an attractive option for hosting private clouds with open source software layers.

IBM Patent Describes 100 PFlop Supercomputer

By Matthew Dublin

An IBM patent filing provides some insight into the architecture of IBM's BlueGene/Q and its successor, a system that will have close to 8.4 million processing cores and consume 16 megawatts, and which is being touted as the first 100 petaflop supercomputer. BlueGene/Q will come to life as the "Sequoia" system at the Lawrence Livermore National Laboratory (LLNL) some time next year. The Sequoia system will have 1,572,864 cores, 98,304 compute nodes and 96 racks.

The patent, which runs to more than 649 pages and contains 2,263 individual claims and descriptions, describes a basic architecture consisting of 1,024 compute node ASICs per rack, built into 512 racks. Each compute node is based on the 4-way hardware-threaded PowerPC A2 CPU architecture, and each unit has 18 cores: one core to improve chip yield, one for system control, and 16 for actual computation. Each node includes 32 MB of memory, and the total memory bandwidth per node is 563 GB/s.

[Image: BlueGene/Q node]

While the BlueGene/Q system at LLNL will reach a peak performance of 20 petaflops, IBM's patent describes the system as being capable of 107 petaflops.

Five years ago, IBM's BlueGene/L, capable of 280.6 TFlop/s, was the most powerful system on the block.

But if Big Blue can deliver on what they promise in their patent, then this new BlueGene/Q-based system could be 381 times faster than BlueGene/L.
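The core counts and the speedup quoted above can be checked with a little arithmetic. In the Python sketch below, the figure of 1,024 nodes per rack is an inference from Sequoia's numbers (98,304 nodes across 96 racks), and only the 16 computation cores per node are counted; both are assumptions of this sketch rather than statements from the patent.

```python
NODES_PER_RACK = 1024         # inferred from 98,304 nodes / 96 racks
COMPUTE_CORES_PER_NODE = 16   # excludes the spare and system-control cores

# Sequoia at LLNL: 96 racks
sequoia_nodes = 96 * NODES_PER_RACK
sequoia_cores = sequoia_nodes * COMPUTE_CORES_PER_NODE

# System described in the patent: 512 racks
patent_cores = 512 * NODES_PER_RACK * COMPUTE_CORES_PER_NODE

# Speedup of the 107-petaflop system over the 280.6-teraflop BlueGene/L
speedup = 107e15 / 280.6e12

print(f"Sequoia: {sequoia_nodes:,} nodes, {sequoia_cores:,} cores")
print(f"Patent system: {patent_cores:,} cores")
print(f"Speedup over BlueGene/L: about {speedup:.0f}x")
```

The 512-rack total of 8,388,608 cores matches the "close to 8.4 million" figure, and the ratio of peak performance rounds to 381.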

Linux DNS Servers & Isilon's "Smart Connect" Feature

By Matthew Dublin

The BioTeam has a highly technical but useful post describing the integration of the Linux Dnsmasq DNS server with Isilon Smart Connect, a nifty feature that allows users to connect all eight 10 Gigabit Ethernet interfaces on an Isilon Network Attached Storage (NAS) array.

On "vanilla" NFS shared filesystems in demanding high-performance computing environments, network links often become the throughput bottleneck, usually with only painful workarounds. But Isilon's Smart Connect lets you basically automate NFS failover and multi-Network Interface Controller (NIC)/multi-node NFS load balancing without the need to install additional software on each node.

The magic behind Smart Connect is that each Isilon NIC gains multiple IP addresses as the IP address pool spreads across all NICs, enabling each NFS client machine to do an NFS mount to a unique IP address.
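The idea of spreading an address pool across NICs and handing each client its own address can be illustrated with a toy Python model. This is not Isilon's actual algorithm: the round-robin policy, the NIC and client names, and the 10.0.0.x addresses are all hypothetical and chosen only to show the general shape of the scheme.

```python
from itertools import cycle
import ipaddress


def spread_pool(nics, pool):
    """Deal the pool's addresses across the NICs in round-robin order."""
    assignment = {nic: [] for nic in nics}
    for nic, ip in zip(cycle(nics), pool):
        assignment[nic].append(ip)
    return assignment


def assign_clients(clients, pool):
    """Give each client the next pool address, wrapping around if the
    clients outnumber the addresses (like a round-robin DNS answer)."""
    return {client: ip for client, ip in zip(clients, cycle(pool))}


# Hypothetical setup: 8 NICs, a 16-address pool, 16 NFS clients.
nics = [f"nic{i}" for i in range(8)]
first_ip = ipaddress.ip_address("10.0.0.10")
pool = [str(first_ip + i) for i in range(16)]
clients = [f"client{i:02d}" for i in range(16)]

by_nic = spread_pool(nics, pool)
mounts = assign_clients(clients, pool)

print(by_nic["nic0"])       # the addresses that landed on the first NIC
print(mounts["client00"])   # the address the first client would mount
```

With as many pool addresses as clients, every client mounts a different address, and the mounts end up evenly divided across the eight interfaces.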

BioTeam was motivated to post the tutorial, which includes a few command line and utility screen shots, because Isilon's administration guide only explains how to establish DNS zone delegation on a Windows system, which is not much help for those in academia, a substantial percentage of their customers.

Head on over to their post for step-by-step instructions on the Linux configuration of Smart Connect, and be sure to thank them for saving you from the ugliness of trying to figure this out on your own.

BGI Bemoans Absurdist Data Transfer

By Matthew Dublin

Given the current state of the United States Post office, that the largest sequenced data sets are still primarily transferred by mail is not heartening. In fact, according to Li Yingrui of the Beijing Genomics Institute “this is absurd.”

In an International Science Grid This Week article that looks at informatics challenges facing the 1000 Genomes Project, Yingrui, as well as other 1000 Genomes Project participants including David Altshuler of the Broad Institute and Phil Butcher, head of IT at the Sanger Institute, describe their frustrations when it comes to dealing with their current data bottleneck. After finishing the first 1,000 genomes in mid-2010 they are now aiming to sequence 2,500 genomes, and because of the current limitations of Internet bandwidth, the cloud won’t do.

“The main issue for us is that our data sizes are so large, that the cost and difficulty of moving the data to the cloud stops it being cost effective for many jobs. We do use the cloud for the Ensembl genomes database, but only to provide [data] mirrors that are closer to users,” says Butcher.

The IT leaders of the 1000 Genomes Project describe how they must “distressingly often resort to shipping hard disks around to transfer data between centers, rather than use the internet, or even via Aspera which is faster than ftp [file transfer protocol].”

The issue is so dire that BGI has established an open access journal, GigaScience, to deal with the problem of data dissemination and organization.

Grid Computing Aids Cell Stress Research

By Matthew Dublin

Researchers at the San Diego Supercomputer Center are studying cell stress, but instead of using high-performance computing hardware, which includes their local CTBP cluster and the Triton supercomputer, the team opted for good old-fashioned grid computing.

The project, led by Nick Schafer, a biophysicist in the Wolynes group at the Center for Theoretical Biological Physics at University of California-San Diego, needed some serious high-performance computing power to study NF-kappaB and I-kappaB proteins, which are both implicated in cell stress response.

In order to study the phenomenon known as "misregulation," which occurs when NF-kappaB and I-kappaB are not working properly, thousands of molecular simulations have to be run, which requires more computational muscle than SDSC had to offer locally.

“Many of our simulations can be run efficiently on a single processor but require many of these simulations to run simultaneously, which makes OSG particularly well suited to our purposes,” Schafer said. “We have had recent technical advances which, combined with the OSG resources, should allow us to move faster than ever towards determining which specific structural aspects are important in the interactions between NF-kappaB, I-kappaB and DNA.”

Their solution was to add the Open Science Grid (OSG) to their HPC arsenal. The OSG is a network of distributed computing resources for scientific research headed up by the Open Science Grid Consortium, an organization composed of service and resource providers. Members independently manage their own resources and the consortium provides the framework for integration.

The OSG currently offers users a range of toolkits specifically for biology and medical research, including the Geant4 software Toolkit, the Grid Laboratory of Wisconsin, the nanoHUB Network for Computational Nanotechnology, and the Structural Biology Grid.