Illumina this week introduced a cloud-based analysis environment for its MiSeq personal sequencer system and said it plans eventually to extend the environment to its high-throughput HiSeq sequencer and its other genomic analysis platforms.
The offering, called BaseSpace, is built on the Amazon Web Services platform and will provide free data-management, archiving, analysis, and storage tools for MiSeq users.
With these capabilities in hand, MiSeq users will be able to upload their data to BaseSpace directly following sequencing, and analyze and share their results with collaborators within the cloud's infrastructure.
Illumina CEO Jay Flatley said in a statement that the addition of cloud power to MiSeq's workflow "eliminates the need for expensive IT infrastructure" and simplifies "the process of adopting a personal sequencer for labs of any size and experience."
Illumina also plans to extend cloud support to other sequencing platforms in its portfolio, beginning with its HiSeq 2000 sequencers as early as next year, and eventually to its microarray and PCR platforms, company representatives told BioInform.
MiSeq users won't be able to access the BaseSpace resource until Oct. 30, when their systems will be automatically linked to the cloud, Illumina officials said.
That will kick off six months' worth of beta testing, after which BaseSpace will continue to be free for MiSeq users, while other customers will be able to choose between a basic free service and a broader array of tools and services that will be available for an as-yet undisclosed fee, said Alex Dickinson, Illumina's senior vice president.
Currently, customers interested in BaseSpace can sign up for accounts and explore a series of datasets from MiSeq runs available on the cloud, Dickinson told BioInform.
Additionally, Illumina is giving early access to the complete tool suite to select MiSeq users whose sequencers are already linked to BaseSpace. The company hasn't released the names of those customers.
Familiar Territory
In its first incarnation, BaseSpace will have the same analysis capabilities that are currently available on the MiSeq Benchtop sequencer, Jordan Stockton, Illumina's associate director of product marketing, told BioInform.
These include workflows for genome resequencing, targeted resequencing, small RNA sequencing, library quality control, 16S metagenomics, and de novo assembly.
This way, "the customer has the choice," Dickinson said. "They can either run the MiSeq and keep their data in the way they've normally done, or the data can be uploaded to the cloud and ... that workflow will be kicked off."
Future versions of BaseSpace will include commonly used open-source bioinformatics software as well as packages from other companies, Stockton said.
Illumina said it also plans to incorporate some of its internally developed software into BaseSpace. For example, Stockton said, the environment could contain some variation of the capabilities available in its Consensus Assessment of Sequence and Variation, or CASAVA, software package.
Additionally, it will provide a series of application programming interfaces that can be used to build third-party applications.
Already, several academic and corporate open-source groups — some of whom offer software that could compete with BaseSpace — develop applications for Illumina data.
Illumina views BaseSpace as an "extension" of its product portfolio, Dickinson told BioInform.
"We have been working for some time on making the workflow for the customer as seamless as possible," he said. Being able to "roll [BaseSpace] into our overall product offering just makes sense for us."
Illumina said BaseSpace won't compete with its bioinformatics partners, but rather will complement them because it provides a platform that could give their solutions a boost in the marketplace.
"When you look at the Illumina partners, what makes their offering unique is [it's] either algorithmic or a workflow that biologists like or an interface and visualization that biologists can take advantage of," Stockton explained. BaseSpace provides an opportunity to "redeploy those [solutions] in a way that might actually broaden the accessibility of those tools."
Dickinson added that Illumina would be open to working with any bioinformatics players who would like to move their software to its cloud.
"The amount of data that is going to be rapidly flowing into BaseSpace is going to make it a very compelling location for any bioinformatics vendor to open up a storefront," he said.
While the company's cloud offering could be a boon for third-party software vendors, it does stand to compete with other cloud-based analysis services, such as DNAnexus, which this week announced plans to ramp up its development with $15 million in financing from several investors, including Google Ventures (see story, this issue).
Illumina also joins other sequencing companies in providing a cloud-based analysis option for its customers.
Last month, Pacific Biosciences announced that it is partnering with Cycle Computing to create a cloud-enabled version of Single Molecule Real Time Analysis, PacBio's open-source analysis software suite for its single-molecule sequencing system (BI 9/23/2011).
Meanwhile, Life Technologies now offers a cloud-based option as one of three configurations for the LifeScope Genomics analysis software for its 5500 SOLiD sequencer (BI 5/27/2011).
A 'Smooth Experience'
Illumina indicated last year that it was exploring cloud architecture as one of a number of efforts to improve data management for high-throughput genomics analysis.
At the time, the company said it was experimenting with cloud-based solutions internally and in partnership with academic researchers, and that it was considering platforms from Amazon, Microsoft Azure, and IBM (BI 2/12/2010).
Ultimately, the company selected Amazon because of its "scalability" and its "capacity, service, and security," Dickinson said.
Illumina already offers some computational and storage infrastructure to clients through its IlluminaCompute program, which targets smaller labs without in-house IT resources.
However, BaseSpace "gives us an opportunity to deploy applications to a wider net of customers" because "it's much more feasible to deploy targeted applications for individual verticals in the cloud," Stockton said.
Illumina chose MiSeq as the first beneficiary of the cloud computing platform because the data size problem is "much easier to deal with" than with HiSeq, he said. In addition, the company believes the offering will ensure a "smooth experience" for users of the Benchtop sequencer, who are likely to be newcomers to next-generation sequencing.
BaseSpace support for HiSeq, planned for next year, will likely include some of the same analysis tools as for MiSeq, but will require some variations because of differences in the user base and user requirements as well as improvements to help it handle larger amounts of data, Dickinson said.
"On the high end, a dataset coming out of a MiSeq is about 2 gigabytes," he explained. Meanwhile "a dataset coming out of a HiSeq is around 600 gigabytes on the high end ... so [we have to work] out the mechanisms required for taking the same infrastructure ... and supporting both of those [sequencers]."