IBM Lifts Hood on Blue Gene/Q SoC

By Matthew Dublin

Moving on from the failed 10-petaflop Power7-based Blue Waters supercomputer project for the National Center for Supercomputing Applications, IBM has now teamed up with two Department of Energy labs reaching for the multi-petaflop stars. IBM's Blue Gene/Q SoC technology will be used to develop the 10-petaflop "Mira" system at Argonne National Lab and the 20-petaflop "Sequoia" at Lawrence Livermore.

While the Power7 chips are set to perform at 256 gigaflops per eight cores and eat up about 200 watts, the Blue Gene/Q SoC chips will provide 204 gigaflops per 18-core processor and consume 55 watts at peak, thus delivering 15 times as many peak FLOPS as Blue Gene/P and 36 times as many as the original Blue Gene/L SoC.
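The efficiency gap is easier to see as gigaflops per watt. Here is a rough back-of-the-envelope calculation using only the peak figures quoted above, taking the Blue Gene/Q figure of 204 to be gigaflops; sustained real-world numbers will differ.

```python
# Peak figures as quoted in the post; the 204 is assumed to mean gigaflops.
power7_gflops, power7_watts = 256.0, 200.0
bgq_gflops, bgq_watts = 204.0, 55.0


def gflops_per_watt(gflops, watts):
    """Peak gigaflops delivered per watt consumed."""
    return gflops / watts


power7_eff = gflops_per_watt(power7_gflops, power7_watts)
bgq_eff = gflops_per_watt(bgq_gflops, bgq_watts)

print(f"Power7:      {power7_eff:.2f} GFLOPS/W")
print(f"Blue Gene/Q: {bgq_eff:.2f} GFLOPS/W")
print(f"Ratio:       {bgq_eff / power7_eff:.1f}x")
```

On these numbers the Blue Gene/Q chip comes out at roughly 3.7 gigaflops per watt against 1.28 for Power7, or a bit under three times the peak efficiency.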

What's special under the hood of the Blue Gene/Q is that it's the first commercially available chip to use transactional memory - a method of organizing related tasks into a single large job for more efficient processing. IBM will implement a transactional memory architecture on each chip using a tagging scheme on its level-two cache memory.
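In general terms, transactional memory lets a thread do a group of reads and writes speculatively and commit them all at once, but only if nothing else touched the same data in the meantime; otherwise the work is thrown away and retried. The toy below mimics that idea in software with a version tag on a single value. It is only an analogy, not a description of how the Blue Gene/Q hardware works, and every name in it (`VersionedCell`, `atomically`, and so on) is made up for illustration.

```python
import threading


class VersionedCell:
    """A value with a version tag (a software stand-in for tagged cache data)."""

    def __init__(self, value):
        self.value = value
        self.version = 0
        self._lock = threading.Lock()  # guards only the brief read/commit steps

    def read(self):
        with self._lock:
            return self.value, self.version

    def try_commit(self, new_value, seen_version):
        """Commit only if nobody else has committed since we read."""
        with self._lock:
            if self.version != seen_version:
                return False  # conflict: the caller must retry
            self.value = new_value
            self.version += 1
            return True


def atomically(cell, update):
    """Run `update` speculatively, retrying until the commit succeeds."""
    while True:
        value, version = cell.read()
        if cell.try_commit(update(value), version):
            return


def add_many(cell, times):
    for _ in range(times):
        atomically(cell, lambda v: v + 1)


counter = VersionedCell(0)
threads = [threading.Thread(target=add_many, args=(counter, 1000)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter.value)  # 4000: every increment committed exactly once
```

The point of the design is that threads never hold a lock while doing their actual work; they only pay a cost when two of them genuinely collide.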

Philip Heidelberger, IBM Research staff member, described the interconnection network and message unit on Blue Gene/Q last week at the 19th annual Hot Interconnects Symposium on High Performance Interconnects. As explained in his abstract, “the network and the highly parallel message unit, which provides the functionality of a network interface card, are integrated onto the same chip as the processors and cache memory and occupy only 8 percent of the chip’s area, including IO cells.”


Sell Your Cycles to Science

By Matthew Dublin

A Portland startup called CPUsage is offering folks a chance to turn their excess CPU cycles into a public computing grid for scientific research.

Think of it as Airbnb for CPUs.

Harnessing idle computers is not a new idea, but CPUsage's approach is a bit different. The startup offers computer owners a chance to build up points, similar to airline miles or credit card rewards, that can be redeemed for gift cards at major restaurants and retailers.

CPUsage’s Cirrus Grid delivers a scalable, high-powered Infrastructure-as-a-Service for high-throughput and high-performance computing. Aimed at SaaS and PaaS offerings, Cirrus Grid provides access to the raw processing power of thousands of physical nodes, is highly secure, and can run most batch-oriented applications and the libraries they require.

Below is a video featuring CPUsage CEO Jeff Martens:

When it comes to harnessing the power of unused computers, the Condor project out of the University of Wisconsin-Madison is the granddaddy of them all.

VBI & Convey Computer to Develop Tools for Life Sciences Researchers

By Matthew Dublin

Thanks to renewed funding from the National Science Foundation, the Virginia Bioinformatics Institute at Virginia Tech is continuing its collaboration with hybrid-core hardware maker Convey Computer.

The $1M project, funded through the NSF's Strategic Technologies for Cyberinfrastructure (STCI) program, centers on Convey's FPGA designs and will allow developers to track efficiency and usage patterns, and develop tools that will be available on the web for researchers to download.

The program will also fund a two-week summer program for undergraduates aimed at giving students some hands-on experience with developing for hybrid HPC architectures.

VBI was an early adopter of Convey solutions. Its hybrid-core system, Shadowfax, features Convey's HC-1 and is used on a range of bioinformatics projects, from gene annotation to sequencing analysis. Shadowfax is a 212-core Intel 64-bit cluster with 16 GPUs, three 64-bit Xeon processors, and three Convey HC-1s with 12 Xilinx FPGAs.

"This partnership will usher in a paradigm shift in the way bioinformatics researchers process data by using Convey's hybrid-core platform and relevant technologies," said Harold Garner, VBI executive director. "Researchers are literally drowning in data. This grant allows us to leverage the most cutting-edge HPC technology to extract meaningful information from such a data-rich environment and move toward applications of lasting value for developers, bioinformaticians, and clinicians."

In May, Convey announced that its GraphConstructor software accelerates de novo genome assembly up to 8.4-fold on its hybrid-core architecture.

Exome Analysis on the Cloud

By Matthew Dublin

A post over at Blue Collar Bioinformatics describes a distributed exome analysis pipeline that allows users to run best-practice software alongside customized code in the cloud. All the components in the pipeline are open source and supported by large user communities. The setup uses CloudMan, a dynamically scalable version of Galaxy, as a platform to build a full SGE cluster environment and CloudBioLinux, a Linux bioinformatics software package. Communication between cluster nodes is facilitated by RabbitMQ with an automated pipeline written in Python that organizes parallel processing across the cluster.
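To give a feel for the queue-based pattern described here, below is a minimal sketch in which worker threads pull sample names off a shared queue. It is not the pipeline from the post: Python's standard `queue.Queue` stands in for RabbitMQ, threads stand in for cluster nodes, and the sample file names and the `process_sample` step are hypothetical placeholders.

```python
import queue
import threading

tasks = queue.Queue()          # stand-in for a RabbitMQ work queue
results = {}
results_lock = threading.Lock()


def process_sample(name):
    """Placeholder for a real analysis step (alignment, variant calling, ...)."""
    return f"{name}.processed"


def worker():
    """Pull sample names off the queue until a None sentinel arrives."""
    while True:
        name = tasks.get()
        if name is None:       # sentinel: no more work for this worker
            break
        output = process_sample(name)
        with results_lock:
            results[name] = output


# Hypothetical input files, for illustration only.
samples = ["sample1.fastq", "sample2.fastq", "sample3.fastq"]

workers = [threading.Thread(target=worker) for _ in range(2)]
for w in workers:
    w.start()
for s in samples:
    tasks.put(s)
for _ in workers:
    tasks.put(None)            # one sentinel per worker
for w in workers:
    w.join()

print(sorted(results))
```

In a real deployment the queue would live in a broker that every node can reach, so adding nodes adds workers without changing the submitting code.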

The post has a few videos to explain the process, starting with establishing a cloud cluster on Amazon Web Services EC2 servers by following the CloudMan setup instructions:

After you've booted up your cloud, move over to the CloudMan web interface on the server and start up an instance using this shared identifier:

cm-0011923649e9271f17c4f83ba6846db0/shared/2011-08-19--21-00

Head over to the post to find more instructions on configuring RabbitMQ messaging to communicate between the nodes, running analysis with FASTQ input files, monitoring the running process, and retrieving results.

OSU's Translational Research Cloud

By Matthew Dublin

Bioinformatics developers at the Ohio State University Medical Center began development last year on the Translational Research Informatics and Data management grid (TRIAD), a cloud resource to help researchers around the world access and analyze biomedical data.

The project has been so successful that TRIAD's developers are now establishing a tech support center at OSU to help facilitate the growing demand.

Philip Payne, chair of the department of biomedical informatics at the OSU Medical Center, and his colleagues began work on TRIAD a year ago with support from the National Institutes of Health. The team was recently awarded an additional $300,000 to complete the project and facilitate its implementation at other academic institutes.

"With the current technology, a researcher might dedicate more than 100 hours to connect the dots between a set of tissue samples, the individual medical histories for the patients who provided those tissues, and then analyzing the group as a whole. With the TRIAD platform, researchers can now execute this type of search and analysis in minutes," said Payne. "When it comes to biomedical research, you have the digital equivalent of the Tower of Babel. One piece is written in French. And another is written in Russian. And maybe a third component is in Chinese…TRIAD acts like the ultimate interpreter between all the different ‘languages’ that biomedical data comes in so that researchers spend time figuring out how the information could improve the way we treat a disease rather than spend time finding and translating various data sets."

TRIAD is built on the same framework as caGrid, the grid framework that Ohio State developed for the National Cancer Institute back in 2005. TRIAD works by pulling disparate data sets into a cloud, where they are converted into a language that a specific end user’s analytics software can utilize. The new cloud is also compatible with popular research databases including REDCap and i2b2.

So far, 20 research institutes have adopted TRIAD, and that number is expected to grow because it's open source, is collaboratively designed, and boasts lots of technical documentation and software components.

Amazon Launches AWS GovCloud

By Matthew Dublin

Amazon Web Services (AWS) launched the AWS GovCloud today (US region only) to help government agencies and contractors store and analyze their data on the cloud.

Given AWS' failures as of late, which have fueled concerns about the cloud's robustness and security, this might not seem like a feasible IT solution for big government data. But Amazon is claiming that they have implemented a number of US government-specific regulatory requirements.

As Amazon.com's CTO Werner Vogels writes on his blog, the US Federal Cloud Computing Strategy does compel US agencies to consider the cloud first as the target for their IT operations. Only time will tell if research agencies will take advantage of this new service, but the whole announcement does raise a few questions: why would AWS not apply these same security standards across the board for all its customers? Why not just make the cloud as secure as possible for everybody, even if they don't have the same regulatory requirements as government agencies?

To reiterate, just keeping the cloud consistently operational has proven to be a challenge, and maybe that's what AWS is looking to improve upon with this new service. The AWS GovCloud will have some substantial redundancy and is spread far and wide geographically, with data centers or "general purpose regions" on the west coast and east coast, one in Ireland (which was recently downed due to a lightning strike), and two in Asia, in Singapore and Tokyo.

If you're not sure how you feel about having your social security number spread across the globe, let alone your genetic information, nobody's blaming you. But now that AWS has this new market, with its stringent security requirements, individual users or research sites will inevitably benefit from the lessons learned as the cloud hopefully becomes more secure and consistent for everybody.

EMBnet.news Morphs into New Bioinformatics Journal

By Matthew Dublin

EMBnet.news magazine has now expanded to become an open access, peer-reviewed journal, EMBnet.journal. The new journal will focus on routine data-analysis tasks, multi-faceted IT architectural infrastructures, and the challenges of data-driven life sciences research.

The first issue is available online now and focuses on next-generation sequencing data analysis.

Included in the maiden issue is a letter to the editor that discusses the Dicode EU project, an initiative that aims to exploit cloud computing, MapReduce, Hadoop, Mahout, and column databases to analyze and aggregate data existing in large and diverse sources.

Also included in the current issue is a paper entitled “GRID distribution supports clustering validation of large mixed microarray data sets.”

This paper describes a grid computing approach that drastically speeds up an unsupervised hierarchical clustering algorithm, called Chaotic Map Clustering, for sorting through microarray databases. Instead of taking days, this approach reduces search time to a few hours. The authors also offer an online service to cluster large microarray data sets via the Job Submission Tool portal.

Cloud Computing Conjurations

By Matthew Dublin

A post over at Titus Brown's Daily Life in an Ivory Basement blog makes an argument for why the cloud is NOT going to solve anybody's computational problems.

The basic gist of the post, which is chock full of more log-linear graphs than you can shake a dongle at, is that the main pitch of the cloud as a cheap and scalable storage and compute solution, perfect for dealing with the data challenges created by next-generation sequencing, is just plain wrong.

Brown puts Lincoln Stein's 2010 Genome Biology paper "The case for cloud computing in genomics" in the crosshairs and points out that when the cost of producing data (in this case sequence data) is decreasing faster than the cost of storage or compute hardware, the solution is not purchasing or renting more hardware, i.e., adopting cloud computing. Instead, writes Brown, the solution is more software and algorithm development to deal with all of the data on whatever hardware you already have. But to be fair, while Stein's paper does explore the potential of the cloud as a scalable, on-the-fly compute solution for dealing with lots of data (which it is), he does point out the drawbacks, including whether or not the cloud makes economic sense for one's particular situation and, of course, the reality check that is network bandwidth.
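Brown's scaling argument can be made concrete with a toy calculation. The halving times below are invented purely for illustration and are not taken from Brown's post or Stein's paper; the only thing that matters is that the cost of producing data falls faster than the cost of storing it.

```python
def growth(years, halving_time_years):
    """How much more a fixed budget buys after `years`
    if the unit cost halves every `halving_time_years`."""
    return 2 ** (years / halving_time_years)


# Assumed rates, for illustration only (not figures from either author).
SEQ_HALVING = 0.5       # cost of producing sequence data halves every 6 months
STORAGE_HALVING = 1.5   # cost of storage halves every 18 months

for years in (1, 3, 5):
    data = growth(years, SEQ_HALVING)
    storage = growth(years, STORAGE_HALVING)
    print(f"after {years} yr: data x{data:.0f}, "
          f"storage x{storage:.1f}, gap x{data / storage:.1f}")
```

With these made-up rates, after three years the same budget produces 64 times as much data but buys only 4 times as much storage, so the shortfall grows no matter whose hardware you rent.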

As one commenter on Brown's blog wrote: "I haven't heard people saying that the cloud is the One Solution that solves the [data volume] problem forever…"

And that’s sort of the point: no one is really disagreeing with Brown. None of the IT folks or bioinformaticians I’ve ever spoken with consider the cloud as the answer to the next-gen sequencing data challenge, but rather, as an interesting or attractive solution with lots of potential. Even technology evangelists like Deepak Singh, business development manager at Amazon Web Services, make no pretense that the cloud is a cure-all. Singh's infamous "There is no magic, there is only awesome: Scientific computing with Amazon Web Services" talk emphasizes the need for the research community to focus on software development that works at the scales cloud computing can offer and that the cloud is hardly a turnkey solution.

So maybe Brown’s argument really needs to be directed more at those in the academic community who are still under the impression that scaling to the cloud is magic.

For some much-needed comedic relief from the rabbit hole that can be a debate about cloud computing, check out Stephen Colbert’s take on the cloud:

More Amazon Cloud Outages

By Matthew Dublin

It looks like the cloud is having trouble weathering some storms as of late.

On Sunday, a lightning strike in Dublin, Ireland caused a failure in data centers powering both Amazon and Microsoft clouds. The lightning struck a transformer causing an explosion that also managed to knock out nearby backup power generators, resulting in a service outage that lasted a surprisingly short three hours.

However, software failures were reported to have complicated matters a bit for Amazon. An error in the software behind Elastic Block Store (EBS), a block-level storage service for use with Amazon EC2 compute cloud instances, mistakenly flagged some blocks as no longer in use, and those blocks were deleted.

Amazon has addressed the error and is working on notifying customers and recovering their data.

As Amazon itself explained on its Service Health Dashboard: "Normally, upon dropping the utility power provided by the transformer, electrical load would be seamlessly picked up by backup generators. The transient electric deviation caused by the explosion was large enough that it propagated to a portion of the phase control system that synchronizes the backup generator plant, disabling some of them."

Continuing on with what has so far not been a stellar week for the cloud provider, on Monday, Amazon's Elastic Compute Cloud in Northern Virginia experienced severe connectivity issues that resulted in roughly 30-minute outages for several websites and services reliant upon AWS. Among the notable sites affected were Foursquare and Instagram, an iPhone photo sharing service.

In April, Amazon and its customers had their first sobering experience when a massive outage across multiple zones in the eastern U.S. downed numerous websites, including Foursquare, Quora, and Reddit. As one industry analyst said at the time, the biggest impact of these outages may be in the cloud's image itself.

"What will take a hit is the image of this technology as being one you can depend on, and that image was critically damaged today," said Rob Enderle of the Enderle Group. "If the outage continues for long, it could set back growth of this service years and permanently kill efforts by many to use this service in the future."

IBM Pulls Plug on "Blue Waters"

By Matthew Dublin

It looks like Big Blue has pulled the plug on the "Blue Waters" project, a Power7-based petaflop-class supercomputer that it was contracted to develop for the National Center for Supercomputing Applications (NCSA) at the University of Illinois.

Both the NCSA and IBM have jointly stated that the project was killed because "The innovative technology that IBM ultimately developed was more complex and required significantly increased financial and technical support by IBM beyond its original expectations. NCSA and IBM worked closely on various proposals to retain IBM's participation in the project but could not come to a mutually agreed-on plan concerning the path forward."

Certainly not a good sign for exascale computing if this petascale project proved too much for IBM and the NCSA...

The project, which kicked off in 2007, had the goal of building a sustained-petaflop supercomputer, which obviously required considerable innovation on the part of IBM in terms of not only processing capability, but more importantly, power efficiency. The electric bills that would be generated from a petaflop supercomputer being run at capacity or at near capacity even during just "regular" business hours would be astronomical at best.

The NCSA has stated that it hasn't given up hope on the delivery of a feasible petaflop supercomputer yet. The site has already tapped the National Science Foundation to ensure that the NCSA's efforts aren't in vain and that a Blue Waters-like system could still be delivered.

Parallelizing Code in the Cloud

By Matthew Dublin

Dutch company Vector Fabrics has released a new cloud-based tool they're claiming will allow developers writing HPC or scientific computing software to easily parallelize code for multicore x86 architectures.

The vfThreaded-x86 cloud-based tool has the ability to examine cache hit/miss effects, data bandwidth to memories and bandwidth between individual code sections. The tool can be accessed through the Vector Fabrics website with a standard web browser while the development tool itself runs on Amazon's EC2 cloud.

Selling Open Source Software

By Matthew Dublin

While the communities for distributed computing platforms like Hadoop are certainly robust enough to provide adopters with lots of support and applications, that doesn't mean the actual installation and management of open source computing platforms is necessarily a breeze.

In a sign of what may be a growing trend of vendors selling hardware and support with free software pre-installed, Dell and Cloudera announced yesterday the rollout of their Apache Hadoop solution that combines Dell servers and networking components with Cloudera's Distribution Including Apache Hadoop (CDH). The supposedly no-headache combo offering will include management tools, training, and tech support for a Hadoop package comprising CDH, Dell Crowbar software, and Cloudera Enterprise with a Dell PowerEdge C2100 server and PowerConnect 6248 48-port Gigabit Ethernet Layer 3 switch. The glue of the package is Dell's Crowbar software, which promises to manage the Hadoop deployment seamlessly, from the initial server boot through the configuration of the primary Hadoop components.

In late July, Dell announced a hardware package built around another open source platform, OpenStack, which implements the cloud computing software on its PowerEdge C servers. Dell claims these are specifically designed for hosting clouds, whatever that means.

Commercial offerings that include open source software always raise the question of why a site or user would want to shell out for free software, but as Derrick Harris over at Gigaom writes, it's not really a tough sell. Some Hadoop and OpenStack users already pay money for commercial versions of the software with support, so why wouldn't they pay for hardware with the software already installed?

The only argument against that logic would be if you're not ready or don't want to purchase a whole new fleet of servers, but still want to kick the tires on one of these open source platforms to see whether or not it will suit your specific needs. Either way, Harris predicts that more commercial vendors are probably going to follow suit, offering their hardware solutions with open source software, and that ultimately it will be a viable business model, as scale-out infrastructure isn't everyone's cup of tea.

Amazon Adds New Features to Its Cloud

By Matthew Dublin

Amazon Web Services announced today the addition of three new features to its cloud service:

1. VPC Everywhere: AWS' Virtual Private Cloud (VPC) — a service that lets users provision private, isolated sections of the AWS cloud where users can launch AWS resources in a virtual network — is now generally available to all customers with an AWS account and can now be used in multiple "Availability Zones" of every AWS Region. VPCs can now span multiple Availability Zones, and each AWS account can now create multiple VPCs.

2. AWS Direct Connect: This allows enterprises to create a connection to an AWS Region via dedicated 1 Gbit and 10 Gbit network circuits in order to enhance privacy and reduce network latency. The service is currently available only at Equinix's Ashburn, Virginia colocation facility, but users can connect through this location to access AWS in the US-East region. Additional AWS Direct Connect locations are planned for San Jose, Los Angeles, London, Tokyo, and Singapore in the next few months.

3. Identity Federation: This new feature allows users to access AWS APIs and resources using fine-grained access controls without the need to create an AWS Identity and Access Management user for each identity. Applications can request temporary security credentials comprising access keys and session tokens that can last anywhere from one hour to 36 hours before they become invalid. There is also no limit on the number of temporary credentials that can be issued.
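The one-to-36-hour window for temporary credentials boils down to a simple expiry check. The sketch below is not the AWS API; the function names and the example timestamp are hypothetical, and it only illustrates the lifetime rule as described above.

```python
from datetime import datetime, timedelta, timezone

MIN_LIFETIME = timedelta(hours=1)
MAX_LIFETIME = timedelta(hours=36)


def expiry_for(issued_at, lifetime):
    """Return when a credential expires, rejecting lifetimes outside 1 to 36 hours."""
    if not (MIN_LIFETIME <= lifetime <= MAX_LIFETIME):
        raise ValueError("lifetime must be between 1 and 36 hours")
    return issued_at + lifetime


def is_valid(expires_at, now):
    """A credential is usable only before its expiry time."""
    return now < expires_at


# Arbitrary example timestamp, for illustration only.
issued = datetime(2011, 8, 4, 12, 0, tzinfo=timezone.utc)
expires = expiry_for(issued, timedelta(hours=12))

print(is_valid(expires, issued + timedelta(hours=11)))  # True
print(is_valid(expires, issued + timedelta(hours=13)))  # False
```

Short-lived credentials like this limit the damage if a key leaks, since it stops working on its own without anyone having to revoke it.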

All of these new features — which should significantly improve security — could help strengthen the case for personal genomics data on the cloud in a healthcare or research scenario, as well as for pharmaceutical companies interested in using a "public" cloud like AWS' cloud.