Researchers at the San Diego Supercomputer Center (SDSC) are handling a growing volume of genomics data as the center moves from being strictly a supercomputing facility to a "big data" environment.
SDSC's Gordon system, a unique flash memory-based supercomputer, is capable of storing 100,000 entire human genomes and can analyze genome datasets hundreds of times faster than conventional computers with spinning disk drives.
The iDASH center, a collaborative computational environment developed by the University of California, San Diego to improve access to health data and software, is supported by SDSC. iDASH, the most recent National Center for Biomedical Computing funded under the National Institutes of Health Roadmap for Bioinformatics and Computational Biology, gives biomedical researchers access to a secure, privacy-preserving infrastructure in which they can analyze their own data, potentially reuse data from others, and build on other research results.
SDSC also has a number of collaborations with the Scripps Translational Science Institute, which has a dedicated 1 gigabit-per-second (Gb/s) network connection to the center and 140 terabytes of online storage.
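To put those two figures in perspective, the back-of-the-envelope calculation below estimates how long it would take to move 140 terabytes over a 1 Gb/s link. It is only a rough sketch under stated assumptions: decimal units (1 TB = 10^12 bytes, 1 Gb = 10^9 bits), a fully utilized link, and no protocol overhead. The function name is illustrative and not part of any SDSC tooling.

```python
# Rough estimate only: assumes decimal units, a fully saturated link,
# and zero protocol overhead. Real transfers would be slower.

TERABYTE = 10**12  # bytes (decimal terabyte, an assumption)
GIGABIT = 10**9    # bits per second (decimal gigabit, an assumption)


def transfer_time_seconds(size_bytes: float, link_bits_per_second: float) -> float:
    """Return the ideal time in seconds to send size_bytes over the link."""
    return size_bytes * 8 / link_bits_per_second


seconds = transfer_time_seconds(140 * TERABYTE, 1 * GIGABIT)
days = seconds / 86_400  # seconds per day

print(f"{seconds:,.0f} s, about {days:.1f} days")
# prints "1,120,000 s, about 13.0 days"
```

Under these idealized assumptions, filling the full allocation through the dedicated link would take roughly two weeks of continuous transfer.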
One such collaboration is the Human Tumor Study, which uses SDSC's Triton Resource to search for genome variants that differ between blood and tumor tissue. Software used in this project includes the Genome Analysis Toolkit, the SOAPdenovo assembler, and various aligners such as ATAC, BLAT, and BWA.
Wayne Pfeiffer, a distinguished scientist at SDSC, is participating in a collaboration called W115 that aims to study the full genome sequence of a 115-year-old woman. Pfeiffer and his colleagues are using popular assembly software, including Velvet and ABySS, on SDSC's systems to run analyses rapidly.