BOSTON (GenomeWeb) - Annai Systems, a Carlsbad, Calif.-based developer of genomic data management solutions, has unveiled a new product for the life sciences market called Annai-ShareSeq that gives customers in academia and industry access to genomic data, bioinformatics pipelines and workflows, and compute power and storage within a secure cloud-based environment.
The company discussed the solution, which is slated for full launch in the fall of this year, during the Bio-IT World conference in Boston this week. In addition to providing cloud-based hardware for analysis, Annai's ShareSeq offers access to open-source tools such as the Broad Institute's Genome Analysis Toolkit and Bowtie, and it lets users combine these tools into customized analysis workflows, Jay Kaufman, Annai's vice president of marketing and strategy, told BioInform in an interview last week. Users will also be able to upload and run their own proprietary pipelines. Furthermore, customers will have access to publicly available datasets, from cancer projects at first, and they'll be able to upload their own data and analyze it in the context of the data already available in ShareSeq, he said. The system also includes Annai-GNOS, the company's data repository management platform, and its GeneTorrent application, which is used to move data in and out of repositories.
Annai is preparing to launch an early access program in the next few weeks, during which potential clients in academia and industry will have an opportunity to test and give feedback on existing features in the platform as well as some features that are currently not in ShareSeq but could be included in later iterations of the solution. Annai has begun talking to potential testers in the oncology research space — which will be the initial target market for ShareSeq — and the company hopes to have firm agreements in place by May or June, Kaufman said.
Meanwhile, Annai is finalizing an agreement with a yet-to-be-named academic institution that will provide the first datasets to be available through the system. Kaufman said that Annai will offer access to both the unprocessed raw data files and normalized data that are ready for customers who want to move directly to biological, pathway, functional, or clinical analysis.
The company is currently mulling the exact price points for access to ShareSeq. Kaufman told BioInform that the company intends to sell annual subscriptions for ShareSeq and also to offer a competitively priced pay-per-use option under which customers will be charged based on the number of compute hours they use. It is also considering price points for access to the data, including whether to charge a separate fee or to include that cost as part of the larger pricing structure. Also being considered is whether to implement caps on the amount of storage space and the size of personal data that can be uploaded to the system, and what the upper limits should be, he said.
ShareSeq is the result of a partnership between Annai and Hitachi Data Systems, according to Kaufman. Last week, Annai said that it had signed an agreement with Hitachi to develop an integrated resource for data management and analysis that combined the Annai-GNOS data management platform with Hitachi's high performance compute and cloud-based solutions. The solution that the release referred to is ShareSeq, BioInform learned this week. The company further stated that it would develop and offer different products and services based on the Hitachi Content Platform that would enable researchers involved in cancer studies and other areas to access, and make better use of, genomic data.
Annai is also one of the technology partners involved in the International Cancer Genome Consortium (ICGC), a global effort to sequence cancer samples from 25,000 individuals in order to obtain a comprehensive description of genomic, transcriptomic, and epigenomic changes in 50 different tumor types and subtypes including breast, bladder, eye, brain, lung, renal, and pancreatic cancers. The company is also involved in the Cancer Genomics Hub (CGHub), a petabyte-scale data repository developed and maintained by researchers at the University of California, Santa Cruz, which aims to provide access to genomic and clinical data generated by several projects led by the National Cancer Institute's cancer genome research programs. The CGHub team uses Annai-GNOS to transfer and manage data in the repository.
The company's involvement with the ICGC served as the catalyst for the partnership with Hitachi, according to Kaufman. Annai already has a compute cluster called BioCompute farm, which is located near the San Diego Supercomputer Center and provides high performance computing, storage, and networking resources. Annai rents compute resources on this system to customers looking to analyze data from The Cancer Genome Atlas project remotely instead of downloading the datasets to their local servers.
Instead of purchasing more hardware to beef up the BioCompute farm to support the ICGC, Annai opted to partner with Hitachi to offer both the storage capacity and compute power that researchers would need to analyze the data that the project produces, Kaufman said. During a presentation focused on the ICGC at the Bio-IT World conference, Lincoln Stein, the director of the informatics and biocomputing program at the Ontario Institute for Cancer Research (OICR) and of the ICGC's data coordination center, said that the initiative has so far generated about 500 terabytes of data, which he said is only a fraction of the total it is expected to generate by the time the project wraps up.
The partnership allows both companies to combine their expertise in data management software and in implementing large data repositories with expertise in cluster-based or cloud-based infrastructure, in order to offer secure access to storage and analysis resources that are co-located with the data and available to researchers on demand, he said.