The days of spending only $1,000 to sequence an entire human genome may be five years or more away, but Synamatix of Kuala Lumpur, Malaysia, is taking steps to establish itself as the data analysis company of choice for next-generation sequencing technologies.
The company recently launched FragBase, a new module in a line of products built upon its “pattern-aware” data storage and management technology. The core of this technology platform, SynaBase, uses pattern-recognition technology to assign “significance” to certain sequence patterns and then stores the relationships between these patterns. Because the database stores each of these patterns only once, in a hierarchical structure, entire genomes can be stored in a much smaller amount of space than in relational databases or flat files, and can therefore be searched much more quickly, according to the company.
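As a rough illustration only, and not Synamatix's actual implementation (whose internals are not described here), the space saving from storing each pattern once can be sketched in a few lines of Python. The `PatternStore` class, its fixed-length chunking, and the parameter `k` are hypothetical simplifications: the sketch is flat rather than hierarchical and does not model any "significance" ranking.

```python
from typing import Dict, List


class PatternStore:
    """Toy store that keeps each distinct fixed-length pattern only once.

    Hypothetical illustration; not the SynaBase data structure.
    """

    def __init__(self, k: int = 4) -> None:
        self.k = k                            # assumed chunk length
        self.pattern_ids: Dict[str, int] = {}  # pattern -> integer id
        self.patterns: List[str] = []          # id -> pattern

    def _intern(self, pattern: str) -> int:
        # Store the pattern on first sight; afterwards reuse its id.
        if pattern not in self.pattern_ids:
            self.pattern_ids[pattern] = len(self.patterns)
            self.patterns.append(pattern)
        return self.pattern_ids[pattern]

    def encode(self, sequence: str) -> List[int]:
        # Split into non-overlapping chunks of length k (the last chunk
        # may be shorter) and replace each chunk with its pattern id.
        return [
            self._intern(sequence[i:i + self.k])
            for i in range(0, len(sequence), self.k)
        ]

    def decode(self, ids: List[int]) -> str:
        # Rebuild the original sequence from the stored patterns.
        return "".join(self.patterns[i] for i in ids)
```

Encoding a sequence in which the same chunk recurs yields a list of small integer ids that point back to a single stored copy of that chunk, and `decode` recovers the original sequence from those ids.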
To date, the core application areas for the company’s technology have been comparative genomics and high-throughput genomic searching, but Synamatix hopes to lay the groundwork for wider adoption of its methods in the coming years.
One area that Synamatix has identified as a good fit for its approach is next-generation sequencing, with methods currently under development at companies like 454, US Genomics, Solexa, and others. These methods promise to greatly speed the sequencing process, but one potential bottleneck is the assembly of large amounts of sequence data. Truly personalized medicine won’t be practical if physicians require server farms to process their patients’ genome sequence data, regardless of the platform that generated it.
FragBase was developed in collaboration with an undisclosed US company developing a next-generation sequencing platform, according to Arif Anwar, vice president of Synamatix.
It has been adapted to assemble “very large numbers of very small bits of DNA,” each shorter than 100 base pairs. Current assembly algorithms rely on sequence alignment and are thrown off by repetitive regions. According to Synamatix, FragBase uses pattern recognition to overcome that problem by identifying repeat regions in advance. Contig coverage is assessed based on the company’s “significance” classification.
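One generic way to flag likely repeat regions before assembly is to count how often each short subsequence (k-mer) appears across all reads and mark the over-represented ones. The sketch below shows that idea only; it is an assumption for illustration, not a description of FragBase's pattern-recognition method, and the function names, `k`, and `threshold` are hypothetical.

```python
from collections import Counter
from typing import Iterable, Set


def count_kmers(reads: Iterable[str], k: int) -> Counter:
    """Count every overlapping k-mer across all reads."""
    counts: Counter = Counter()
    for read in reads:
        # A read of length n contains n - k + 1 overlapping k-mers.
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def flag_repeats(reads: Iterable[str], k: int, threshold: int) -> Set[str]:
    """Return the k-mers seen at least `threshold` times.

    In a real data set the threshold would have to be set relative to
    the expected sequencing coverage; here it is just a parameter.
    """
    counts = count_kmers(reads, k)
    return {kmer for kmer, n in counts.items() if n >= threshold}
```

An assembler could then treat reads dominated by flagged k-mers with extra caution when deciding which overlaps to trust.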
“People are now moving toward the $100,000 genome in a year or two, and maybe the $1,000 genome will be attainable in four to five years. That would be a good timeline to hit, and we’re ready for that,” Anwar said.
Apparently, Synamatix is not alone in planning ahead for the $1,000 genome. In mid-March, more than 50 people tuned in for a webcast that the company presented on the subject.
Anwar added that the company is reaching out to other developers of high-throughput sequencing systems for potential partnerships. “We’re talking to everyone in this market,” he said. “We’ve made it a real key focus area, so there’s no one we’re not talking to. We’re not going to wait around.”