Accelrys plans to launch the next version of its NGS Collection in the coming months. The release will include additional tools for de novo assembly and will incorporate Breakdancer for detecting structural rearrangements and large-scale insertions, deletions, and inversions, according to a company official.
Clifford Baron, the company's director of product marketing, told BioInform about the new version this week during the company's North American users' meeting in Jersey City, NJ.
Accelrys released the NGS Collection for its Pipeline Pilot platform in February, joining what has been a steady stream of informatics companies that have entered the next-generation sequencing market recently (BI 02/25/2011).
About a month later, the shop penned a deal with sequencing newcomer Oxford Nanopore to develop software that enables real-time analysis of experimental data produced by the firm's GridIon single-molecule-analysis sequencer.
The companies said they would develop "nanopore-specific" workflows for Accelrys's Pipeline Pilot that will enable secondary and higher-level data analysis for Oxford Nanopore's DNA sequencing system (BI 03/18/2011).
At the user meeting, BioInform caught up with Richard Carter, Oxford Nanopore's senior scientist for applications and bioinformatics.
He discussed why the company tapped Accelrys as its preferred informatics partner and talked about the software the partners plan to develop.
Below is an edited version of that conversation.
Why partner with Accelrys, which is a newcomer to the next-generation sequencing market, rather than working with companies that are more established in the space?
They are newcomers to the field but they are not newcomers to data analysis. We saw it as a company that’s very capable of producing flexible systems that adapt very rapidly.
Next-generation sequencing changes more rapidly than anything else. A lot of the competing systems tie you into a much more rigid framework, and a lot of them are principally graphical user interfaces for bioinformaticians.
We want to expose a broader range of customers to the analysis of next-generation sequencing. A good way of doing that was with the flexibility of Pipeline Pilot. You don’t need to be an expert in computational techniques in order to use Pipeline Pilot, especially [with] some of the innovations they are talking about now; they are going to make it more web-based and less pipeline-based.
At the moment, they are not talking about anything for NGS, but that’s somewhere in the future. You can get people just dragging and dropping components and building seriously advanced analysis pathways, which is what we want. We want the customers to be able to produce the maximum value from our machines and that means answering biological questions without necessarily having to deal with all the research necessary to get up to speed with the bioinformatics.
The statement Oxford released announcing the partnership with Accelrys mentioned that both companies would work together to develop 'nanopore-specific workflows.' Can you elaborate on that?
Essentially, it's workflows designed specifically for the platform. Because we are producing real-time data — [that is] as [the platform reads] the bases, you'll get the bases and the quality scores straight off the machine — you can then feed that into an analysis platform (that's in Pipeline Pilot), and do real-time analysis.
The real-time analysis will allow you to say, 'Have I got 30x coverage over all of my genes?' If the answer to that is 'yes,' you can feed that back into the machine and say, 'OK, I have sufficient data so now I can stop the machine.' That is a unique selling point because there is no delay in running the experiment till the end point and then starting the analysis. The analysis actually controls the end point. If you can define the question within Pipeline Pilot, you can make that a stop condition for the experiment.
[For example,] you can make it, 'I want to find 1,000 novel single-nucleotide polymorphisms in this list of genes, so when I have found 1,000 novel SNPs, stop the experiment.' Or potentially you can say, 'I want to run a condition that favors certain types of reads until I get this much data, then alter the experimental conditions and run it under a condition that favors a different type of read,' so you might want so much data at shorter read lengths and so much data at longer read lengths.
Those are the sorts of things that the nanopore-specific technology, [which is the] combination [of] Pipeline Pilot and Oxford Nanopore's GridIon system, will give you.
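The stop condition Carter describes, in which the analysis decides when the run ends, can be pictured with a short Python sketch. This is an illustration only, not Oxford Nanopore or Accelrys code: the function names, the `(gene, aligned_bases)` read format, and the use of mean coverage per gene (aligned bases divided by gene length) as a stand-in for "30x coverage over all of my genes" are all assumptions made for the example.

```python
from typing import Dict, Iterable, Tuple


def mean_coverage(bases_per_gene: Dict[str, int],
                  gene_lengths: Dict[str, int]) -> Dict[str, float]:
    """Mean coverage per gene: aligned bases divided by gene length.

    Assumption: mean coverage is a simplification; a real check might
    require every position in a gene to reach the target depth.
    """
    return {gene: bases_per_gene.get(gene, 0) / length
            for gene, length in gene_lengths.items()}


def coverage_reached(bases_per_gene: Dict[str, int],
                     gene_lengths: Dict[str, int],
                     target: float = 30.0) -> bool:
    """True once every gene of interest has reached the target coverage."""
    coverage = mean_coverage(bases_per_gene, gene_lengths)
    return all(depth >= target for depth in coverage.values())


def run_until_covered(read_stream: Iterable[Tuple[str, int]],
                      gene_lengths: Dict[str, int],
                      target: float = 30.0) -> Tuple[int, bool]:
    """Consume reads as they arrive and stop when the condition is met.

    read_stream yields hypothetical (gene, aligned_bases) pairs.
    Returns (reads consumed, whether the stop condition was reached).
    """
    bases_per_gene: Dict[str, int] = {}
    reads_used = 0
    for gene, aligned_bases in read_stream:
        reads_used += 1
        if gene in gene_lengths:  # ignore reads outside the genes of interest
            bases_per_gene[gene] = bases_per_gene.get(gene, 0) + aligned_bases
        if coverage_reached(bases_per_gene, gene_lengths, target):
            return reads_used, True  # the point at which the run would be stopped
    return reads_used, False  # stream ended before the target was reached


if __name__ == "__main__":
    lengths = {"GENE_A": 100, "GENE_B": 200}  # made-up gene lengths in bases
    stream = [("GENE_A", 1500), ("GENE_B", 3000),
              ("GENE_A", 1500), ("GENE_B", 3000), ("GENE_A", 100)]
    print(run_until_covered(stream, lengths))
```

The same loop shape would apply to the other conditions mentioned, such as stopping after a set number of novel SNPs; only the predicate checked after each read would change.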
What are some of the unique features of the data coming out of your sequencer that set it apart?
I can't give any specifics on data categories like read length and quality. But it's real time; as the bases are being read, we have them available for analysis.
Right off the bat, what sort of analysis capabilities are you and Accelrys looking to provide customers with?
At the moment, it's pretty much standard DNA sequencing. A predominant component of the GridIon system is DNA sequencing. We have other applications that we have developed for the GridIon system, which we haven't built analysis applications for within Pipeline Pilot: protein sensing, small-molecule detection, those sorts of things that we can produce within the same system. It will be different cartridges, but it will be the same basic platform that can do that.
At the moment, we are focusing all our data analysis efforts on the lead application, which is DNA sequencing based on the normal remit that you would use DNA sequencing for [such things as] genome analysis, resequencing, de novo [assembly], and all [those] sorts of things.
One of the things that was a positive for Accelrys was that you can potentially go much further in [the analysis] within Pipeline Pilot. So you can go quite easily from SNPs to genes. We have written protocols where you can say, 'Show me all non-synonymous SNPs within genes of interest,' and you can then say, 'This is the effect of this SNP in this genome,' and potentially start linking into much further and richer information that’s available within the family of Pipeline Pilot collections.
Pipeline Pilot is used to handling data flows from multiple sources. They have a history in microarrays [and] image analysis. ... 'From here how do I get on to asking real biology [questions] of my data?' That’s where we want our users to be, not just producing next-generation sequencing data but actually posing [questions], setting the success criteria, and getting the answers.
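The "show me all non-synonymous SNPs within genes of interest" query Carter mentions amounts to a filter over annotated SNP calls. A minimal Python sketch follows, assuming each call has already been annotated with its gene and with the reference and variant amino acids. The `SnpCall` record and the function name are hypothetical, and the sketch does not represent how Pipeline Pilot protocols are actually built.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class SnpCall:
    """One annotated SNP call (hypothetical record layout)."""
    gene: str      # gene the SNP falls in
    position: int  # position within the reference
    ref_aa: str    # amino acid encoded by the reference codon
    alt_aa: str    # amino acid encoded by the variant codon


def nonsynonymous_in_genes(calls: Iterable[SnpCall],
                           genes_of_interest: Set[str]) -> List[SnpCall]:
    """Keep SNPs that change the encoded amino acid and lie in a listed gene."""
    return [call for call in calls
            if call.gene in genes_of_interest and call.ref_aa != call.alt_aa]


if __name__ == "__main__":
    # Made-up calls for illustration only.
    calls = [
        SnpCall("GENE_A", 101, "L", "L"),  # synonymous: filtered out
        SnpCall("GENE_A", 215, "R", "H"),  # non-synonymous, gene of interest
        SnpCall("GENE_C", 42, "G", "D"),   # non-synonymous, gene not listed
    ]
    for hit in nonsynonymous_in_genes(calls, {"GENE_A", "GENE_B"}):
        print(hit)
```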
Are you developing any software in house?
All of the on-machine analysis is being written in-house. We have a team of software developers that will make sure that the signal from the nanopore is converted into the final results with very fast turnaround.
We are also writing some specific Oxford Nanopore components that will essentially be instrument control. So there will be a component that says start this run, stop this run, alter these conditions, which would only be called from within Pipeline Pilot and only talk to the GridIon system. The Pipeline Pilot node interface is enabled within Pipeline Pilot.
Are you looking to partner with any other software companies?
We have a philosophy that we are building the platform and we are looking at solutions that will help our customers to get the maximum value from their experiments. For sequence analysis, we have partnered with Accelrys and will distribute Pipeline Pilot as our preferred analysis suite. We are looking at partnering in other areas, but we are not giving specifics until we have solutions in place.