Pistoia Launches Phase Two of Sequence Services Project; Emphasizes NGS Data Analysis, Management


By Uduak Grace Thomas

The Pistoia Alliance kicked off the second phase of its sequence-services project last week with a challenge to vendors to build a fully functional platform for next-generation sequence data storage and analysis.

The public/private consortium, which was founded in 2007 with the aim of improving pharmaceutical R&D informatics, last week issued a detailed request for proposals outlining its requirements for a hosted platform for sequence storage and analysis.

Organizers also held a conference call, attended by representatives from pharma and several software vendors, to discuss the project.

During the call, Simon Thornber, a business consultant for GlaxoSmithKline and project leader for the sequence services team, said Phase 2 builds on Phase 1.

Wrapped up in the spring, Phase 1 focused on "non-functional" requirements in order to more generally assess whether third-party suppliers could provide a secure environment to host sequence data and analysis applications.

The results of this phase were presented at a meeting the consortium held in Boston in April. There, six vendors displayed four cloud-based platforms as proof that they could provide pharma with secure access to genomic information stored in public databases, and the ability to upload and analyze their proprietary data (BI 04/22/2011).

In Phase 1, which began in 2010, participants were tasked with developing secure infrastructure that could deliver good performance, scalability, and availability; perform Blast queries on datasets; and host secure installations of Ensembl and PlasMapper, an application that generates and annotates plasmid maps using plasmid DNA sequence as input.

The emphasis for Phase 2 is to use open-source tools for NGS data analysis to develop "functionalities" while simultaneously maintaining the security and scalability of the platform, Thornber said.

In addition to providing a written response to the RFP together with a presentation, interested vendors must provide business models that show a commitment to support and develop the proposed systems as well as commercialization plans and pricing models.

Thornber told BioInform that groups that did not participate in the first phase can submit proposals for the second phase, although their platforms would have to meet the security and data-sharing standards of the first round.

Interested vendors are expected to turn in their proposals by close of business on Sept. 23, GMT.

As part of the second phase, Pistoia is offering "shared risk" funding of $50,000 for up to five proposals. The funding, provided by GlaxoSmithKline, AstraZeneca, Roche, Novartis, and Lundbeck, will be based on the response to the RFP, the depth of development, and the innovative nature of the approach, Thornber said.

Vendors whose proposals are chosen to receive funding will be notified in October.


The Phase 2 RFP asks participants to provide systems for NGS data analysis and storage that "demonstrate their ability to support scientists' needs for sequence analysis and in doing so demonstrate to users and suppliers the feasibility of moving such systems from inside companies to an external, shared, service provider."

The alliance requires submitted systems to be as "near production" as possible and to be usable with real data by non-expert informaticians.

Among other capabilities, the platforms are expected to enable researchers to align proprietary sequences to publicly available data in tools like Ensembl; provide a gene alias search that uses both public aliases and lists that are unique to each company; and offer an RNA-seq pipeline with tools to align short reads to a reference and to tabulate, for each gene, read counts and RPKM values (reads per kilobase of exon model per million mapped reads).
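
For readers unfamiliar with the metric, RPKM normalizes a gene's read count by both the length of its exon model and the total number of mapped reads in the sample. The Python sketch below illustrates the arithmetic only; it is not taken from the Pistoia RFP, and the function names, gene labels, and numbers are hypothetical. It also assumes, for simplicity, that the total mapped reads defaults to the sum of the per-gene counts when no total is supplied.

```python
def rpkm(read_count, exon_length_bp, total_mapped_reads):
    """Reads per kilobase of exon model per million mapped reads.

    RPKM = reads / ((exon length in bp / 1e3) * (mapped reads / 1e6))
         = reads * 1e9 / (exon length in bp * mapped reads)
    """
    if exon_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("exon length and total mapped reads must be positive")
    return read_count * 1e9 / (exon_length_bp * total_mapped_reads)


def tabulate_rpkm(gene_counts, gene_lengths_bp, total_mapped_reads=None):
    """Return {gene: (read_count, rpkm)} for every gene in gene_counts.

    Assumption: if total_mapped_reads is not given, the sum of the
    per-gene counts is used as the library size.
    """
    if total_mapped_reads is None:
        total_mapped_reads = sum(gene_counts.values())
    return {
        gene: (count, rpkm(count, gene_lengths_bp[gene], total_mapped_reads))
        for gene, count in gene_counts.items()
    }


# Hypothetical example: two genes in a sample with 2 million mapped reads.
counts = {"GENE_A": 500, "GENE_B": 1500}
lengths = {"GENE_A": 2000, "GENE_B": 3000}
table = tabulate_rpkm(counts, lengths, total_mapped_reads=2_000_000)
for gene, (count, value) in table.items():
    print(gene, count, value)
```

In this example GENE_B has three times the reads of GENE_A but only twice the RPKM, because its exon model is longer.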

Systems should also provide access to several well-known bioinformatics tools, including EMBOSS, Clustal-W, SAMtools, Bowtie, and TopHat.

According to Thornber, a "Phase 2-ready" system would allow a user to assign data to specific projects and provide simple analysis pipelines.

Furthermore, platforms should be able to expand in order to incorporate capabilities listed in the RFP that vendors may not have time to include in the first incarnation of the tool.

While the RFP goes into significant detail about system requirements, the sequence-services team doesn't expect vendors’ platforms to address every point, Thornber said.

The team expects to begin testing systems' security and functionality in the second week of March 2012 and present final pilots at the next Pistoia conference, which is to be held the following month.

Pistoia plans to launch a third phase of the project next year in which it will include commercial tools and algorithms in the platforms, Thornber said.
