BOSTON (GenomeWeb) – At the Bio-IT World Conference and Expo this week, SQream Technologies, a developer of databases for big data analytics, announced the launch of GenomeStack, its first product for the life sciences market.
The company will be offering GenomeStack demonstrations during Bio-IT, which is being held here this week.
The newest addition to SQream's portfolio uses the same underlying technology as its flagship product SQream DB, launched last year, which the company describes as a high-speed graphics processing unit-based columnar SQL database that was designed to address the speed, scalability, and efficiency challenges associated with big data analytics in industries such as finance, healthcare, and cybersecurity.
SQream already supplies systems to clients in those domains, and now, through GenomeStack, it intends to bring its infrastructure to the life sciences, and more specifically to genomics, a domain that also grapples with large quantities of data. The database infrastructure, coupled with GPU-optimized code, enables users to analyze and query large numbers of sequences scalably and efficiently. It provides researchers with "an efficient set of tools in one solution for storing and analyzing more data at speeds never before available," CEO Ami Gal said in a statement.
Gal told GenomeWeb that the company spent the last year working on GenomeStack with several experts in the genomics field to design a product that best fit the needs of the space. These development partners, he said, were interested in tools that allowed them to compare a large number of samples simultaneously and offered an affordable alternative to traditional and often time-consuming file-based storage technologies or manual search systems. These clients also sought faster, easy-to-use systems that could scale as the quantities of data they worked with increased.
GenomeStack addresses those needs and also simplifies the analysis process, according to Gal.
To use GenomeStack, users upload their BAM files into structured SQL tables in the database along with data from other repositories such as 1000 Genomes, dbSNP, or the UCSC genome browser. The database can hold up to 100 terabytes of raw data, according to SQream. With a few clicks, "you get all your genome samples in one database and you can start manipulating and comparing thousands of samples in parallel," Gal said.
In addition to the general SQream database infrastructure, GenomeStack includes tools specifically for use in genomic analysis, including software for visualizing queries, a scripting tool for writing queries to run on data, and application programming interfaces for integrating and running third-party analysis applications on the data. Users can zoom in on sequences and explore nucleotide distribution at specific chromosome positions for all samples. These results can then be downloaded in CSV or txt formats for further manipulation and analysis.
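As a rough illustration of what a per-position nucleotide distribution query across samples might look like in SQL, the minimal sketch below uses Python's built-in sqlite3 module as a stand-in database. The table name base_calls, its columns, the sample rows, and the function nucleotide_distribution are all hypothetical; they are not GenomeStack's schema, API, or data, which the company has not described in detail.

```python
import sqlite3

# Hypothetical schema (an assumption, not GenomeStack's): one row per base call.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE base_calls (sample TEXT, chrom TEXT, pos INTEGER, base TEXT)"
)

# Made-up toy rows standing in for data loaded from BAM files.
rows = [
    ("s1", "chr1", 100, "A"),
    ("s1", "chr1", 100, "A"),
    ("s1", "chr1", 100, "G"),
    ("s2", "chr1", 100, "G"),
    ("s2", "chr1", 100, "G"),
    ("s2", "chr1", 101, "T"),
]
conn.executemany("INSERT INTO base_calls VALUES (?, ?, ?, ?)", rows)


def nucleotide_distribution(conn, chrom, pos):
    """Return {sample: {base: count}} for one chromosome position."""
    cur = conn.execute(
        "SELECT sample, base, COUNT(*) FROM base_calls "
        "WHERE chrom = ? AND pos = ? "
        "GROUP BY sample, base ORDER BY sample, base",
        (chrom, pos),
    )
    result = {}
    for sample, base, count in cur:
        result.setdefault(sample, {})[base] = count
    return result


# Base counts at chr1:100 for every sample in the toy table.
distribution = nucleotide_distribution(conn, "chr1", 100)
print(distribution)
```

A single GROUP BY over sample and base covers every sample in one query, which is the kind of cross-sample comparison the article describes; how GenomeStack actually structures or executes such queries is not stated.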
The primary target market for SQream's GenomeStack is genetic research centers and universities that are running large-scale genome analysis projects, but it could also be used by individual researchers, Gal said. For institutions, pricing for GenomeStack is based on the amount of data that a center expects to store and analyze using the system. Prices start at between $50,000 and $100,000. For individual researchers, the company plans to eventually offer a cloud-based version of GenomeStack and charge for use per research project or based on the amount of time.
So far, GenomeStack is being used in at least one unnamed cancer research center and in two other research centers, one in a university and another in a large hospital — these are all located in Israel, where the company is headquartered.