API for Statistical Phylogenetics with HPC

By Matthew Dublin

Researchers at the University of Maryland have developed BEAGLE, an application programming interface and specialized library for high-performance statistical phylogenetic inference that allows existing software packages to make more effective use of available computer hardware including GPUs, CPUs with Streaming SIMD Extensions, and multi-core CPUs via OpenMP.

The team described their work in Systematic Biology, writing that "a specialized library for phylogenetic calculation would allow existing software packages to make more effective use of available computer hardware, including GPUs. Adoption of a common library would also make it easier for other emerging computing architectures, such as field programmable gate arrays, to be used in the future."

BEAGLE is compatible with Mac, Windows, and Linux operating systems. It is freely available for download here.

GPUs for GWAS

By Matthew Dublin

Because the computational burden of searching for epistasis in genome-wide association study data is often prohibitive, a team from the Roslin Institute at the University of Edinburgh has attempted a powerful and cheap implementation of a search algorithm on GPUs using OpenCL. The team published a paper in Bioinformatics describing the GPU implementation, which achieved a 92-fold speedup of an exhaustive epistasis scan for a quantitative phenotype.
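
To see why an exhaustive scan gets expensive so quickly, it helps to count the tests: every SNP has to be paired with every other SNP, so the workload grows with the square of the marker count. The short Python sketch below is only an illustration. The 500,000-SNP panel and the 30-day baseline run time are assumptions chosen for the example, not figures from the paper; only the speedup factor of 92 comes from the text above.

```python
def pairwise_tests(n_snps: int) -> int:
    """Number of distinct SNP pairs in an exhaustive two-locus scan."""
    return n_snps * (n_snps - 1) // 2


def accelerated_days(baseline_days: float, speedup: float) -> float:
    """Run time after applying a given speedup factor to a baseline."""
    return baseline_days / speedup


if __name__ == "__main__":
    n_snps = 500_000        # hypothetical panel size (assumption)
    baseline_days = 30.0    # hypothetical CPU-only run time (assumption)
    speedup = 92.0          # speedup factor reported for the GPU implementation

    print(f"{pairwise_tests(n_snps):,} pairwise tests")
    print(f"{accelerated_days(baseline_days, speedup):.2f} days with the speedup")
```

Doubling the number of SNPs roughly quadruples the number of pairs, which is why a large constant-factor speedup matters so much for this kind of scan.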

In their paper, the authors write that "to achieve a comparable computational improvement without a graphics card would require a large compute-cluster, an option that is often financially non-viable. The implementation presented uses OpenCL—an open-source library designed to run on any commercially available GPU and on any operating system."

Their software, called EpiGPU, is open-source and GPU-vendor independent, meaning that it will run on any GPU card.

It can be downloaded here.

Amazon Rolls out NoSQL Database Service

By Matthew Dublin

Amazon Web Services (AWS) has launched a fully managed NoSQL database service in the cloud called DynamoDB that aims to provide seamless scalability on the fly. AWS is claiming that their new service will offload administrative tasks such as hardware provisioning, setup, configuration, replication, software patching, and cluster scaling.

According to their announcement, "developers can create a database table that can store and retrieve any amount of data, and serve any level of request traffic. DynamoDB automatically spreads the data and traffic for the table over a sufficient number of servers to handle the request capacity specified by the customer and the amount of data stored, while maintaining consistent, fast performance. All data items are stored on Solid State Disks and are automatically replicated across multiple Availability Zones in a Region to provide built-in high availability and data durability."

Amazon's CTO Werner Vogels has a post on his blog discussing the announcement, where he describes DynamoDB as the result of 15 years of "learning" in the areas of large-scale non-relational databases and cloud computing. "Several years ago we published a paper on the details of Amazon’s Dynamo technology, which was one of the first non-relational databases developed at Amazon. The original Dynamo design was based on a core set of strong distributed systems principles resulting in an ultra-scalable and highly reliable database system."

With a NoSQL database there is no strict schema, so data is collapsed into one very fat table where each row stores a huge amount of data. The NoSQL database contains a lot of data redundancy, which means more storage space and computational power are required compared to SQL databases.

AWS might attract customers in genomics with this offering as there have already been several use cases of NoSQL in the cloud for omics research. For example, last October, Monsanto deployed Cloudant's NoSQL database as the foundation of their genomics data analysis system.

DynamoDB users can get started with a free tier account that includes 40 million requests per month free of charge. Additional request capacity is priced at cost-efficient hourly rates as low as $0.01 per hour for 10 units of Write Capacity or 50 strongly consistent units of Read Capacity, with replicated solid state disk storage at $1 per GB per month.
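
For a rough sense of what those rates add up to, here is a back-of-the-envelope estimate in Python. It is a sketch under stated assumptions, not Amazon's billing logic: it assumes capacity is charged in whole blocks of 10 write units or 50 read units, uses a 30-day month, and ignores the free tier and any data transfer charges. The workload numbers in the example are made up.

```python
import math

# Rates quoted in the announcement
WRITE_UNITS_PER_BLOCK = 10
READ_UNITS_PER_BLOCK = 50          # strongly consistent reads
BLOCK_PRICE_PER_HOUR = 0.01        # dollars
STORAGE_PRICE_PER_GB_MONTH = 1.00  # dollars

HOURS_PER_MONTH = 24 * 30          # assumption: 30-day month


def monthly_cost(write_units: int, read_units: int, storage_gb: float) -> float:
    """Estimate a monthly bill, rounding capacity up to whole blocks (assumption)."""
    write_blocks = math.ceil(write_units / WRITE_UNITS_PER_BLOCK)
    read_blocks = math.ceil(read_units / READ_UNITS_PER_BLOCK)
    hourly = (write_blocks + read_blocks) * BLOCK_PRICE_PER_HOUR
    return hourly * HOURS_PER_MONTH + storage_gb * STORAGE_PRICE_PER_GB_MONTH


if __name__ == "__main__":
    # Hypothetical workload: 100 write units, 500 read units, 200 GB stored
    print(f"${monthly_cost(100, 500, 200):.2f} per month")
```

With these assumptions the storage charge and the throughput charge end up in the same ballpark, so both are worth estimating before moving a large dataset over.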

GPU-Based Cluster Aids Nanocarrier Simulations

By Matthew Dublin

A team at the University of Illinois at Chicago is using both traditional and GPU-based clusters at the National Center for Supercomputing Applications (NCSA) to study nanocarriers. Like an empty bullet casing, nanocarriers could provide a targeted delivery method for drugs needed to kill cancer cells.

The NCSA's clusters enabled the researchers to perform extensive atomistic molecular dynamics simulations of polyethylene glycol (PEG)-ylated phospholipid dendron-based micelles — aggregates of surfactant molecules dispersed in a liquid colloid — in which the micelles are characterized in pure water and ionic solutions.

"Our simulations are massive," says principal investigator Petr Kral. "They have up to 750,000 atoms and they need to be calculated for a relatively long time, up to 30 nanoseconds. That is why the supercomputer was very useful to us and very necessary."

While Kral and his collaborators developed their own GPU-based computer system in their lab, it lacked the power for the simulations they run. Their results were published last year in the Journal of the American Chemical Society and Chemical Communications.
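
To put the "massive" figures quoted above in perspective, the sketch below counts how many integration steps and per-atom updates a run of that size implies. The 2-femtosecond timestep is an assumption (a common choice in atomistic molecular dynamics), not a detail reported by the researchers. The atom count and simulated time are the upper bounds Kral cites.

```python
FS_PER_NS = 1_000_000  # femtoseconds in one nanosecond


def md_steps(sim_ns: int, timestep_fs: int) -> int:
    """Number of integration steps needed to cover sim_ns of simulated time."""
    return (sim_ns * FS_PER_NS) // timestep_fs


def atom_updates(n_atoms: int, steps: int) -> int:
    """Total per-atom position updates across the whole run."""
    return n_atoms * steps


if __name__ == "__main__":
    n_atoms = 750_000   # upper bound quoted above
    sim_ns = 30         # upper bound quoted above
    timestep_fs = 2     # assumption: typical atomistic MD timestep

    steps = md_steps(sim_ns, timestep_fs)
    print(f"{steps:,} steps, {atom_updates(n_atoms, steps):,} atom updates")
```

Each of those updates also requires force calculations against neighboring atoms, which is where most of the compute time goes.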

GPU-Accelerated Short Read Aligner

By Matthew Dublin

Researchers from the University of Cambridge and University College Cork have released BarraCUDA, GPU-accelerated short-read DNA sequence alignment software based on BWA.

The team used Nvidia's Compute Unified Device Architecture (CUDA) to develop the software on a GPU. BarraCUDA demonstrated a throughput six times the speed of a CPU core for gapped alignment and even faster when gap opening is disabled.

They describe BarraCUDA in BMC Research Notes.

According to the team, when it comes to implementing alignment software, multiple GPUs scale better than CPUs. They write that "a normal computer can easily take up 4 GPUs, meaning that using this test library as an example, a single-end alignment can be done in 5 min, which is twice the speed of a high-end 12-core workstation. Using 8X GPU, we can achieve an alignment speed 3X faster than a traditional computing node with 12 CPU cores, making GPU nodes a more favourable option, in terms of HPC environment, than using those with CPUs."

BarraCUDA can be downloaded here.

Highlights from SC11 Part IV

In this video, we speak with Todd Smith, senior leader of research and applications at PerkinElmer, and researchers from Stockholm University on accelerating the GROMACS molecular dynamics software suite with GPUs and CPUs.

More FPGAs for Genomics

By Matthew Dublin

CLC Bio and Sciengines have announced a collaboration to offer users Sciengines' RIVYERA FPGA-based platform with BLASTp, BLASTn, and Smith-Waterman.

The RIVYERA hardware platform allows the BLAST implementation, which is still under development, to run on the 128 FPGAs that can be set up in each compute unit. An early version of this solution will be showcased at the International Plant & Animal Genome (PAG XX) conference on January 14-18 in San Diego.

According to Jost Bissel, chief software architect at Sciengines, the initial results from running CLC's tuned BLASTp on their FPGA solution demonstrated a 188-fold speedup using 64 FPGAs when compared to a Xeon core processor. "The benchmark ran BLASTp to align 920000 amino acids against a database of 1 billion amino acids. Similar acceleration has been achieved in early benchmark tests of the BLASTn version, and we expect both BLAST implementations to be accelerated even further before the final release," says Bissel.

Also at the PAG XX conference, Pico Computing will demonstrate its FPGA-accelerated BFAST solution, which runs on its M-502 FPGA modules. According to the company's white paper, this BFAST implementation is 100 times faster than BFAST running in software and ten times faster than Bowtie. The FPGA system maps 92 percent of short reads versus 85 percent for Bowtie. This sensitivity can be further tuned in the FPGA system. In addition, Pico integrated its FPGA system with Geneious Pro's plugin API to create a visualization and analysis interface.

Timelogic, Mitrionics, Convey, and SGI have also released FPGA BLAST implementations. It's difficult to really compare FPGA BLAST solutions against each other, since every offering out there has a unique hardware configuration, but all of them provide impressive results when compared to a CPU.

A Few Hundred Genomes in Your Pocket

By Matthew Dublin

Victorinox, maker of the world-famous Swiss Army Knife, has, surprisingly, become the first to offer a 1TB USB stick, the world's largest thumb drive. It is sold either as the thumb drive on its own or with a pair of scissors and a knife. The drive can be accessed via USB 2.0, USB 3.0, or eSATA, has AES 256-bit encryption, and has a 48 x 96 dot monochrome LCD display with room enough to provide a device label or some indicator of the drive's contents.

The thumb drive, which was exhibited at this week's CES conference in Las Vegas, comes with a price tag of $2,000, so if you're the type of person who is apt to misplace their car keys often, you might want to skip this one.

In theory, you could stick roughly 340 human genomes on this drive — not including annotations and other data of course — which begs the question: Could snail mail as a data transfer method for research collaborations make a comeback? It would be a lot cheaper to send some USBs in a box compared with a crate of disk arrays or hard drives, and possibly quicker than uploading data to the cloud.
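
The arithmetic behind that idea is easy to check. The sketch below assumes an uncompressed genome of about 3 billion bases stored at one byte per base, a decimal terabyte, and a hypothetical 100-megabit-per-second uplink. All three are assumptions for illustration, and real file sizes vary a lot with format and compression.

```python
DRIVE_BYTES = 10**12        # 1 TB, decimal (assumption)
GENOME_BYTES = 3 * 10**9    # ~3 billion bases at one byte each (assumption)


def genomes_per_drive(drive_bytes: int = DRIVE_BYTES,
                      genome_bytes: int = GENOME_BYTES) -> int:
    """Whole genomes that fit on the drive, ignoring file system overhead."""
    return drive_bytes // genome_bytes


def upload_hours(n_bytes: int, link_megabits_per_s: float) -> float:
    """Hours needed to push n_bytes over a link running at full speed."""
    seconds = n_bytes * 8 / (link_megabits_per_s * 1_000_000)
    return seconds / 3600


if __name__ == "__main__":
    print(f"{genomes_per_drive()} genomes per drive")
    print(f"{upload_hours(DRIVE_BYTES, 100):.1f} hours to upload 1 TB at 100 Mbit/s")
```

Under these assumptions the drive holds about 333 genomes, close to the rough figure above, and a full drive takes most of a day to upload even if the link never drops below its rated speed. Overnight courier starts to look competitive once more than one drive is involved.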

These USBs also sport some pretty formidable security. The drive immediately emails its owner if plugged into an unauthorized computer and, if no reply is received from the owner, zaps the flash memory and deletes the data.

Small and Smart Storage Boxes for LIMS

By Matthew Dublin

While there are a lot of high-end storage systems marketed to large-scale sequencing operations, it's not every day that you come by specially designed desktop storage units natively hosting scientific data management software for the individual researcher or small lab. BioTeam has hacked a few Drobo storage arrays to embed their own MiniLIMS software directly inside the array. This isn't a product that BioTeam is offering, but it's an interesting example of successful hacking for research.

Getting LIMS to run as an application inside a storage array — a desktop-sized storage array — is worth taking a look at because, according to BioTeam, in the near future it's not inconceivable that such "smart storage" devices could replace PC-based laboratory instrument operating workstations.

"As storage units get smarter and more capable the need for a dedicated Windows PC attached to an instrument or Genome Sequencer becomes less important….Something like this seems attractive for single-instrument genomics environments or labs where dedicated research IT staff may not be easily available," they wrote on their blog.


Cost-Conscious Cloud-App Development

By Matthew Dublin

Thanks to IBM, you might be hearing a lot more about a cloud computing development platform called Green Hat in the future.

Green Hat, which has actually been in business since 1996, allows software developers to kick the tires and work out the kinks on their cloud software before it actually gets to the cloud.

IBM has purchased the company in order to add its technology to their Rational Software development platform.

A virtual environment simulates a wide range of IT infrastructure configurations and headaches, thereby allowing an institution to bypass painful parts of the software development process. Coding for the cloud can cost money and definitely eats up hours in the lab, even for an experienced programmer with an Amazon AWS account.

Getting popular bioinformatics software applications — typically run on clusters or workstations — to operate smoothly on the cloud is still anything but trivial, so a testing environment that allows for some growing pains could prevent lots of frustration and wasted grant money.
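
The general idea of testing against simulated infrastructure can be shown in a few lines of Python. This is not Green Hat's product or API. It is a generic sketch in which `SimulatedObjectStore` and `upload_with_retry` are hypothetical names invented for the example: a fake storage service fails on purpose so that retry logic can be exercised on a laptop before any cloud hours are billed.

```python
import random


class SimulatedObjectStore:
    """Hypothetical stand-in for a cloud storage service that sometimes fails."""

    def __init__(self, failure_rate: float, seed: int = 0):
        self.failure_rate = failure_rate
        self._rng = random.Random(seed)
        self.objects = {}
        self.attempts = 0

    def put(self, key: str, data: bytes) -> None:
        self.attempts += 1
        if self._rng.random() < self.failure_rate:
            raise ConnectionError("simulated transient network failure")
        self.objects[key] = data


def upload_with_retry(store, key: str, data: bytes, max_attempts: int = 5) -> int:
    """Store data, retrying on transient errors; return the attempts used."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            store.put(key, data)
            return attempt
        except ConnectionError:
            if attempt == max_attempts:
                raise  # give up after the last allowed attempt


if __name__ == "__main__":
    flaky = SimulatedObjectStore(failure_rate=0.5, seed=42)
    used = upload_with_retry(flaky, "reads/sample1.fastq", b"ACGT")
    print(f"stored after {used} attempt(s)")
```

Turning the failure rate up to 1.0 checks that the code gives up cleanly, and setting it to 0.0 checks the happy path, all without touching a real service.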

Highlights from SC11 Part III

This clip features Virginia Tech's Wu Feng on "green HPC" and GPU computing for life sciences on the HokieSpeed supercomputer. Also included, the Pittsburgh Supercomputing Center's Philip Blood on the Blacklight supercomputer and collaborations with the Broad Institute.

Cloud Computing Aids Semantic Search Startup

By Matthew Dublin

One of the rarely cited benefits of cloud computing is that small bioinformatics startup companies now have a fighting chance to demo and market the kind of ambitious compute solutions which traditionally only more established institutions or companies had the funding to support. Software tools for dealing with big data that previously would have required cost-prohibitive IT infrastructures to run can now be spun up on several nodes in the cloud, on the fly and on the cheap. This enables startups to provide working versions of their technology to build interest among investors and the user community until that lucky day when their technology is licensed out by a big institutional or commercial customer.

A small semantic search technology software company based in Brigantine, NJ called Weblib is using the cloud to do just that. I had the chance to chat with Weblib's CEO Tamas Doszkocs, a computer scientist and semantic technology expert who recently retired from the Specialized Information Services Division of the National Library of Medicine, about their use of the cloud.

Back in March, Weblib won a National Library of Medicine software development challenge, "Show off Your Apps: Innovative Uses of NLM Information," for NLMplus, a semantic search and discovery application that utilizes a variety of semantic resources and natural language processing tools to produce improved search results from the collection of biomedical data and services of the National Library of Medicine (NLM).

"We are a tiny company, so it's very hard to compete with the big guys in the semantic search arena, like IBM with their Watson system — those are awesome companies that can put hundreds of developers and unlimited resources on a particular problem," says Doszkocs. "In order to process all these full text biomedical databases, parallel processing is absolutely necessary, and the cheapest way of doing that is renting some time on the cloud. That way, these huge text databases can be broken into chunks to be processed in parallel in a way that's not that costly and takes only days to complete versus weeks on desktop computer."

NLMplus, which features a Google- or Bing-like user interface, boasts as its primary innovation a semantic search engine that typically produces relevant search results from 1.6 million PubMed Review articles that are semantically indexed and searched. The NLMplus application also sends conceptually enhanced Boolean queries to NLM’s PubMed system of more than 21 million citations from the literature.

Doszkocs says that as their customer base increases, they will also look to using the cloud as an on-demand computational resource for their online retrieval service.
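
The chunk-and-process-in-parallel approach Doszkocs describes follows a familiar pattern: split the collection, hand each piece to a worker, then combine the partial results. The Python sketch below shows that pattern on a toy scale. It is not Weblib's code. The term-counting task and the sample documents are made up, and a local thread pool stands in for rented cloud nodes.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(records, chunk_size):
    """Break a list of records into consecutive chunks of at most chunk_size."""
    return [records[i:i + chunk_size] for i in range(0, len(records), chunk_size)]


def count_term(chunk, term):
    """Count whole-word, case-insensitive occurrences of term in one chunk."""
    term = term.lower()
    return sum(doc.lower().split().count(term) for doc in chunk)


def parallel_term_count(records, term, chunk_size=2, workers=4):
    """Process chunks concurrently and add up the partial counts."""
    chunks = split_into_chunks(records, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_counts = pool.map(lambda chunk: count_term(chunk, term), chunks)
        return sum(partial_counts)


if __name__ == "__main__":
    docs = [  # made-up stand-ins for full-text records
        "Gene expression in yeast",
        "A gene regulates another gene",
        "Protein folding",
        "gene therapy trials",
        "No match here",
    ]
    print(parallel_term_count(docs, "gene"))
```

Because each chunk is processed independently, the same structure scales from threads on a desktop to separate machines in the cloud. The work that changes is how the chunks are shipped out and how the partial results are collected.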

Data Dancing on the Ceiling

By Matthew Dublin

In what sounds like an outdated science fiction trope, researchers from the University of California, Santa Barbara, and Intel's labs in Oregon have come up with a way to bounce data off the ceilings of data centers using powerful Wi-Fi technology. Their method increases I/O speeds by some 30 percent.

Heather Zheng of UCSB used 60-gigahertz Wi-Fi, a technology developed for high-definition wireless communications that the rest of us will have to wait until next year to get our hands on.

While the idea of using wireless connections in a large data center isn't new, previous efforts have required that Wi-Fi stations sending and receiving data have a line-of-sight connection. But by angling the stream of data upwards and bouncing it off the ceiling, signals can be transmitted from one area of a densely packed data center to another, enabling nodes to send data wherever it is needed regardless of their physical location within a center.

Zheng and her team placed flat metal plates on the ceiling that provided perfect reflection in a simulation of a 160-rack data center. "Our simulation shows that wireless can add 0.5 terabytes per second," she says.

The team is now working on building a prototype data center to put their solution into practice.

You can read more here.