While it is safe to say that graphics processing units and multicore processors have claimed the limelight in high-performance computing, the field of high-performance reconfigurable computing, or HPRC, has been steadily ramping up its bioinformatics capabilities. The basic concept behind reconfigurable computing, or RC, is that rather than the code being forced onto a GPU or multicore processor (which remains a big challenge for researchers), the hardware itself is tailored to the specific quirks of the code, using hardware description languages to shape the chip from the ground up. Without a doubt, the star player in the reconfigurable computing world is a processing technology called the field-programmable gate array, or FPGA, which offers researchers huge speedups with surprisingly low power consumption.
Biologists have traditionally used FPGAs to accelerate brute-force algorithms, but the arrays have not been that impressive when it comes to the types of algorithms that genomics researchers regularly use. Scott Hauck, a professor at the University of Washington, seeks to change that by putting FPGAs back in the running as the go-to hardware for bioinformaticians. Hauck is working with Seattle-based reconfigurable hardware vendor Pico Computing to develop high-performance computing solutions for the bench biologist. In July, he was awarded $100,000 in Washington Technology Center funding to support the development of parallel solutions for DNA sequencing using FPGA-based logic systems.
"For the genomics community, we are looking to deploy complete, relevant applications that will help provide the computing support to complement current and future high-throughput DNA sequencing systems," Hauck says. "Right now, we are working on the short-read mapping problem, and this can help speed applications such as ChIP-seq, RNA-seq, and detection of the genetic basis of disease and help move towards the $1,000 human genome and personalized medicine. Moving forward, we plan to continue ... to create a complete, high-performance human genome resequencing flow that can run two to three orders of magnitude faster than current practice as well as develop approaches for searching for non-coding RNA structures."
Last year, Pico Computing breathed new life into the role of FPGAs in genomics when the firm announced it had used a cluster of 112 commodity FPGAs fitted into a 4U server chassis to achieve more than a 5,000-fold speedup of a bioinformatics sequence analysis algorithm, the dot plot, or matrix plot, using less than 300 watts of power. The FPGA-enabled version of the dot plot was written in the C programming language, initially using only a single FPGA attached to a laptop. Convey Computer, a hardware maker specializing in hybrid-core systems that incorporate FPGAs along with multicore processors, announced back in May that the Virginia Bioinformatics Institute at Virginia Tech had adopted its systems for the 1000 Genomes Project as well as for text mining and policy informatics work. The University of South Carolina's Heterogeneous and Reconfigurable Computing Group has also adopted Convey's hybrid-FPGA solutions to explore the acceleration of phylogenetic inference methods. In November, Convey released its HC-1ex hybrid rack system, said to be capable of running an optimized version of Smith-Waterman that is 401 times faster than speeds typically achieved on standard x86 CPUs.
Efforts to move FPGAs beyond the role of co-processor and into a full-scale computing solution for biology — an area of research referred to as "BioRC" — are already underway. The Novo-G machine is a modest-sized cluster that relies on FPGAs for the vast majority of its computational muscle, and it is about to be put to use for bioinformatics as well as a range of other scientific applications. Housed at the University of Florida's Center for High-Performance Reconfigurable Computing, or CHREC, Novo-G is the flagship effort meant to showcase reconfigurable computing for genetic and molecular dynamics research.
A new field
University of Florida professor and CHREC director Alan George says that BioRC efforts have already caught the attention of several vendors interested in taking a reconfigurable computing approach to their biotechnology devices. CHREC currently has three industry partners, including Monsanto, which is interested in RC for agricultural genomics and proteomics. "They already do a lot of genomics computations with traditional HPC machines, but came to us because they were dissatisfied with the speed of some of those applications," George says. "RC is exciting for them because they can design new types of corn crops and pesticides at an accelerated rate."
Another CHREC partner is Veritomyx, a small company that specializes in reconfigurable computing for high-accuracy protein sequencing and isoform determination from tandem mass spectrometry data. "They're interested in proteomics for cancer diagnosis and isoformics and have come up with a new algorithm to diagnose certain strains of cancer so that treatments can be employed that directly address that particular type of cancer," George adds. "Their new algorithms are so computationally complex that if they were to run them on conventional computers, the patient would die before they could get the answers back."
Novo-G, which was completed in July 2009, is composed of roughly 200 commodity FPGAs and uses CPUs only as helpers — the opposite of how most reconfigurable hardware currently works. According to its developers, Novo-G rivals the world's top supercomputers at a fraction of the energy consumption and cost. So far, CHREC researchers have ported a number of bioinformatics applications to Novo-G, including Smith-Waterman without traceback, a wavefront implementation of the Needleman-Wunsch sequence alignment algorithm, and a metagenomics application called Needle-Distance, or ND, an augmentation of Needleman-Wunsch that includes distance calculations.
FPGA developers like Boston University associate professor Martin Herbordt acknowledge that the GPU and multicore processing trends of late have certainly stolen the thunder of reconfigurable computing. However, Herbordt maintains that this is mostly due to the money that companies like Intel and Nvidia have put into marketing campaigns. He says that the hype surrounding these technologies has little to do with their actual processing power, or with their ease of programming for the non-expert, when compared to FPGAs.
Unfortunately, the average biologist is still in way over her head when it comes to porting code to this powerful breed of hardware. "It's my goal that FPGAs become more like GPUs in that there is a standardized board and a standardized programming environment, like [Nvidia's] Fermi GPU chip and their Compute Unified Device Architecture, or CUDA, programming environment, because that doesn't exist right now for FPGAs," he says. "Things are getting better, but not standardized. That's keeping FPGAs from being broadly popular, so I'm hoping that in the next couple of years that will turn around — there's really no technological reasons why that can't be the case."