Kyoto University Taps SGI Infrastructure for GenomeNet Database, Bioinformatics Server


By Uduak Grace Thomas

SGI said this week that Kyoto University's Institute for Chemical Research has selected one of its high-performance computing and storage systems to support a network of databases and bioinformatics tools.

Specifically, the system will support GenomeNet — a Japanese network of databases and computational services for genomics and related biomedical research that is operated by the institute's bioinformatics center. The new SGI machine will also support computational chemistry research projects at the university.

Minoru Kanehisa, a professor and director of the bioinformatics center, told BioInform that Kyoto University's purchase replaces an existing SGI system at the institute based on the company’s Altix 4700 line.

The new system consists of SGI UV 1000 systems and will operate at 32.6 teraflops with 48 terabytes of memory and 840 TB of storage.

The system includes more than 3,072 Intel Xeon processor E7 series cores and consists of two servers: one for computational chemistry and the other to handle GenomeNet’s computational services component and internal genome annotation efforts.

The GenomeNet server will include two nodes with 1,024 cores and 16 TB of shared memory. Among other tasks, that server will be used to compute the “results of sequence similarities among all genes in all available genomes [stored in the KEGG SSDB database] using the SSEARCH program,” Kanehisa said.

The computational chemistry server, which will be used for applications such as quantum chemistry and molecular dynamics, will contain two nodes with 512 cores and 8 TB of shared memory.
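
As a rough consistency check on the figures above, the short Python sketch below adds up the cores and shared memory of the two servers. It assumes the quoted core and memory figures apply to each node rather than to each server as a whole; that reading is an assumption made here for illustration, although it does reproduce the 3,072 cores and 48 TB of memory cited for the full system. The dictionary keys and variable names are hypothetical.

```python
# Figures quoted in the article, read as per-node values (an assumption).
servers = {
    "GenomeNet": {"nodes": 2, "cores_per_node": 1024, "memory_tb_per_node": 16},
    "computational chemistry": {"nodes": 2, "cores_per_node": 512, "memory_tb_per_node": 8},
}

# Sum cores and shared memory across both servers.
total_cores = sum(s["nodes"] * s["cores_per_node"] for s in servers.values())
total_memory_tb = sum(s["nodes"] * s["memory_tb_per_node"] for s in servers.values())

print(f"Total cores: {total_cores}")
print(f"Total shared memory: {total_memory_tb} TB")
```

Under that per-node reading, the totals come to 3,072 cores and 48 TB, matching the system-wide numbers SGI gave.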

Kanehisa said the new system would reduce data processing times in GenomeNet while making it easier and faster to give users the data they need.

'Essentially a Massive PC'

SGI launched its Altix UV line in 2009, offering customers a choice between the UV 1000, which ships as an integrated cabinet-level solution with up to 256 sockets (2,048 cores) and 16 TB of shared memory in four racks; and the UV 100, which scales to 96 sockets (768 cores) and provides 6 TB of shared memory in two racks.

In the last year, the company has sold more than 300 UV systems, about 20 percent of which have been to groups within the life sciences, Eng Lim Goh, SGI's chief technology officer, told BioInform.

Within life sciences, SGI's UV systems are used in drug companies, sequencing centers, and academic research institutes. Outside the space, SGI's hardware is being used in engineering, fluid dynamics, and signal processing, among other industries.

Goh said that SGI's UV system improves on traditional compute clusters or cloud infrastructure because it provides customers with a way to explore their genomic data as a whole rather than in a distributed fashion where the data is broken up into smaller pieces that are spread out over different nodes for analysis.

As output from high-throughput sequencing platforms grows, it has become increasingly difficult for researchers to manage that data, let alone analyze it, he said.

Typical cluster-based systems "combine multiple computers together to crunch this data [and do things like] alignment, matching, and correlation" in genomics and proteomics studies, he said, while SGI’s UV systems help life science researchers look at their data in a "monolithic way," which is more convenient and improves productivity.

The UV "is essentially a massive [personal computer] that has terabytes of memory and ... thousands of cores that’s really treated as one PC" that can handle these large quantities of data, Goh said.

With its cohesive approach to data analysis, the UV system is useful for analyzing whole genomes or looking for correlations between multiple genomes and protein maps, he said.

Furthermore, the system includes two tiers of storage — one of which is an embedded "sleeping disk" that, according to Goh, offers a cheaper option than tape-based storage for holding data in the long term.

A single piece of software manages the entire system, moving data into the secondary storage tier and presenting both tiers in such a way that the data appears to be held on a single disk drive.

Although Goh stressed that the system is simple to use, customers need to have an IT person trained in Linux to set up and maintain the machine.

Have topics you'd like to see covered in BioInform? Contact the editor at uthomas [at] genomeweb [.] com.