GnuBio Targets 1K-Base Reads for Commercial Launch of $50K Sequencer

By Monica Heger

This story was originally published Oct. 3.

GnuBio's microfluidics-based sequencing system will have read lengths of 1,000 base pairs when it launches broadly in the second quarter of 2012, the company's vice president of informatics John Healy said last week at the NGX conference in Providence, RI.

Healy also said that City of Hope Hospital has joined GnuBio's roster of early-access customers, which also includes the Montreal Heart Institute. In a pilot project with City of Hope, the company is sequencing two exons encompassing 477 base pairs on the TP53 tumor suppressor gene.

Healy said that the company will initially focus on targeted sequencing when it launches the system next year, and will then move into transcriptome sequencing and whole-genome sequencing, but declined to disclose the timeline for those applications.

GnuBio plans to price its system at $50,000, and consumables costs will vary by project. For example, the cartridge for a 50-gene targeted sequencing project, which includes all the necessary reagents and analysis, will cost a customer between $50 and $70, said Healy. The turnaround time for such a project would be around three hours.

GnuBio's system is based on microfluidics technology developed in co-founder David Weitz's physics laboratory at Harvard University. A user injects genomic DNA into the system, loads a cartridge containing the necessary reagents, and then "hits go and walks away," Healy said.

The system was designed "from the ground up" to meet three main criteria: "minimal sample prep, low cost, and rapid turnaround time," he added.

After the user hits 'go' on the machine, the genomic DNA is first sheared into 1-kilobase-sized fragments. Then a "picoinjector," essentially a pipette about a quarter the width of a human hair, injects amplified DNA into microdroplets that contain a pair of PCR primers that represent the amplicon to be sequenced.

A second picoinjection transfers the enriched DNA into sequencing probe droplets, which are barcoded with fluorescent-labeled hexamer oligonucleotides pulled from a library that contains all possible base combinations — 4,096. The company has also designed a collection of longer probes to capture long homopolymer stretches and di- and tri-nucleotide repeats, for a total of around 5,000 probes.

The platform has an optical readout, and the analysis is all done within the machine. After sequencing, each amplicon ends up with an "aggregate binding signature," which is used to map it against the reference so its general location can be identified. Then a local reconstruction is done to ensure each base is accurate.
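The article does not describe how the "aggregate binding signature" is computed or mapped, so the following is only a toy sketch of the general idea, under stated assumptions: a signature is modeled as the set of hexamers that occur in a sequence, and mapping means finding the reference window that shares the most hexamers with it. The function names, the toy reference string, and the scoring rule are all hypothetical and are not GnuBio's method.

```python
def binding_signature(seq, k=6):
    """Set of all k-mers occurring in seq (a stand-in for which probes bind)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def map_to_reference(signature, reference, window, k=6):
    """Return (offset, score) of the reference window sharing the most k-mers."""
    best_offset, best_score = 0, -1
    for start in range(len(reference) - window + 1):
        window_sig = binding_signature(reference[start:start + window], k)
        score = len(signature & window_sig)
        if score > best_score:  # ties keep the earliest window
            best_offset, best_score = start, score
    return best_offset, best_score

# Toy data, invented for illustration only.
reference = "ACGTTGCAAGGCTTACCGATAGGCTAACGTTAGC"
amplicon = reference[10:22]  # "GCTTACCGATAG"

signature = binding_signature(amplicon)
offset, score = map_to_reference(signature, reference, window=len(amplicon))
print(offset, score)  # 10 7
```

A real system would also have to cope with noisy probe signals and sequence variants, which this sketch ignores; the local reconstruction step mentioned above is not modeled here.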

Each read will have both a "positive probe signal" and four "negative probe signals," said Healy, because there will often be an "alternative hypothesis" to the correct sequence. By comparing the negative probes to the positive probe, the correct sequence can be identified.

The system currently has read lengths of 612 base pairs, but the company expects the commercial version to reach the 1,000 base pair mark. Additionally, Healy said that the company could scale up, eventually achieving read lengths of 4,000 base pairs, by increasing the number of oligos in its library to 16,384, which it will do by using heptamers instead of hexamers.

Healy said that step would increase the library complexity without increasing detection complexity. Despite using more than 16,000 oligo probes, not all of them would need to be read, he said, because the amplicon can be accurately mapped without measuring the entire library.

For example, one amplicon can be run on four separate channels, with four separate oligonucleotide libraries. Each would end up with its own unique signature, which could be merged downstream to reconstruct the entire amplicon, Healy added.
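The library sizes quoted above follow from counting every sequence of length k over four bases, 4^k. As a minimal sketch, the snippet below enumerates the hexamer and heptamer libraries and then splits the heptamers across four channels. The equal, disjoint split is an assumption made for illustration; the article does not say how the four libraries would be composed.

```python
from itertools import product

BASES = "ACGT"

def kmer_library(k):
    """Return every DNA sequence of length k over A, C, G, T."""
    return ["".join(p) for p in product(BASES, repeat=k)]

hexamers = kmer_library(6)
heptamers = kmer_library(7)
print(len(hexamers))   # 4096
print(len(heptamers))  # 16384

# Assumption (not from the article): four equal, disjoint sub-libraries,
# one per channel.
channels = [heptamers[i::4] for i in range(4)]
print([len(c) for c in channels])  # [4096, 4096, 4096, 4096]
```

Under that assumed split, each channel would carry as many probes as the full hexamer library, which is one way to read Healy's remark about raising library complexity without raising detection complexity.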

Targeting Clinical Applications

Test runs of the system have focused on targeted amplicon sequencing. The company has so far done 100 runs on its machine, focusing on amplicons with clinical relevance, including the TNNT2 gene, which is associated with cardiomyopathies; tumor suppressors TP53 and PTEN; the KRAS oncogene; and pUC19, a plasmid cloning vector.

Error rates average 0.03 percent, with a minimum of 0.003 percent and a maximum of 0.6 percent. Healy said that the system does not have any systematic bias. It can "measure easily five- to six-base homopolymers" and is "insensitive to GC composition."
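For readers used to Phred-scaled quality values, the error rates above can be converted with the standard formula Q = -10 log10(p). The article itself reports only percentages, so the Q values below are a derived restatement, not figures from GnuBio.

```python
import math

def phred(error_rate):
    """Phred quality score for a per-base error probability."""
    return -10 * math.log10(error_rate)

# Error rates from the article, given in percent.
reported = [("average", 0.03), ("minimum", 0.003), ("maximum", 0.6)]

for label, pct in reported:
    print(f"{label}: {pct}% error is about Q{phred(pct / 100):.1f}")
```

On this scale the reported average corresponds to roughly Q35, with individual runs ranging from about Q22 to about Q45.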

One system has already been installed at the Montreal Heart Institute, which has tested it on clinical samples, sequencing amplicons of the TNNT2 gene (IS 7/19/2011). The team chose that gene because a variant that leads to greater risk of cardiomyopathies is common in a subpopulation in Quebec.

The institute tested the machine on seven clinical samples, sequencing a 395-base pair amplicon encompassing two exons of the gene in two samples, and a 30-base pair region surrounding the hot spot of the gene on another five samples.

The team generated over 3,000-fold coverage per sample with 99.4 percent accuracy and 99.96 percent consensus accuracy, and correctly called 100 percent of the variants from all samples. Total sequencing time was seven minutes.

Healy said that the company is designing the platform specifically to minimize turnaround time, including capture and analysis, for targeted sequencing projects, "which we see dominating the clinical space for quite a long time."


Have topics you'd like to see covered by In Sequence? Contact the editor at mheger [at] genomeweb [.] com.