Synamatix Offers Comparative Genomics Shortcut via Pattern-Aware Database


There are generally two ways to speed whole-genome comparisons: faster algorithms or faster hardware. But Synamatix, a bioinformatics startup based in Kuala Lumpur, Malaysia, says it has developed a new approach to this familiar problem that focuses instead on the database that stores the genomic sequence information.

This database, called SynaBase, sits at the hub of a suite of products that the company is developing to mine the data stored within it. Johan Poole-Johnson, global marketing manager, said the database relies on pattern-recognition technology that assigns “significance” to certain sequence patterns and then stores the relationships between these patterns. Because the database stores each of these patterns only once, in a hierarchical structure, multiple genomes can be added to the system. These genomes would occupy a much smaller amount of space in SynaBase than they would in relational databases or flat files, which retain redundant information for all the raw sequence data across entire genomes, he said.
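The storage idea described above, keeping each pattern once and recording genomes as references to shared patterns, can be illustrated with a toy sketch. Synamatix has not published how SynaBase selects "significant" patterns or builds its hierarchy, so everything here is an assumption for illustration: the `PatternStore` class and its method names are hypothetical, and fixed-length chunks stand in for whatever patterns the real system identifies.

```python
class PatternStore:
    """Toy store: each distinct pattern is kept once; genomes are lists of pattern IDs.

    Assumption: fixed-length chunks stand in for SynaBase's undisclosed
    "significant" patterns. This is not the company's algorithm.
    """

    def __init__(self, chunk_size=4):
        self.chunk_size = chunk_size
        self.pattern_ids = {}  # pattern string -> integer ID
        self.patterns = []     # integer ID -> pattern string
        self.genomes = {}      # genome name -> list of pattern IDs

    def add_genome(self, name, sequence):
        """Split the sequence into chunks and store only chunks not seen before."""
        ids = []
        for start in range(0, len(sequence), self.chunk_size):
            chunk = sequence[start:start + self.chunk_size]
            if chunk not in self.pattern_ids:
                self.pattern_ids[chunk] = len(self.patterns)
                self.patterns.append(chunk)
            ids.append(self.pattern_ids[chunk])
        self.genomes[name] = ids

    def reconstruct(self, name):
        """Rebuild the original sequence from its pattern IDs."""
        return "".join(self.patterns[i] for i in self.genomes[name])

    def shared_patterns(self, first, second):
        """Return the patterns that occur in both named genomes."""
        common = set(self.genomes[first]) & set(self.genomes[second])
        return {self.patterns[i] for i in common}


store = PatternStore(chunk_size=4)
store.add_genome("genome_a", "ACGTACGTTTGA")
store.add_genome("genome_b", "ACGTGGCC")
print(len(store.patterns))                            # distinct patterns stored
print(store.shared_patterns("genome_a", "genome_b"))  # patterns common to both
```

In this sketch, adding a second genome stores only the chunks the first genome did not already contribute, and the shared-pattern lookup becomes a set intersection over IDs rather than a scan of raw sequence.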

In addition to the reduced storage requirements, Poole-Johnson said the approach improves upon traditional methods for comparative genomics, which focus on how frequently certain patterns occur rather than on their significance. Using the analogy of an English sentence to describe how SynaBase looks at the genetic code, Poole-Johnson said that traditional, similarity-based methods would extract terms like “the,” “an,” and “to,” “because they occur frequently in the language, but they don’t necessarily provide meaning.” SynaBase, by comparison, would be able to home in on the nouns and verbs that get the speaker’s point across, he said.

SynaBase works in concert with SynaSearch and SynaCompare, the company’s genome comparison and visualization tools, to quickly align long genome sequences, and even entire genomes. Synamatix said it has aligned two bacterial genomes in less than 24 seconds, and the human X and Y chromosomes in under 2.5 minutes. A self-comparison of the whole human genome that took 4.5 days to run on a single CPU with Mummer would take only 44 hours using SynaBase, according to the company.

Henk Heus, vice president of R&D at Gene-IT and an expert in comparative genomics, said that a new architecture for storing sequences “is a good idea — the result is performance and new science that is not achievable with traditional approaches.” However, Heus noted, from the information that the company has made public about its technology, it’s difficult to assess its accuracy, or how well it is able to “coax syntax and meaning out of the patterns.”

According to Poole-Johnson, one demonstration of the company’s technology that proves its pattern-recognition capabilities involves removing all the spaces between the words in a book, and then feeding the book into SynaBase. Once the database identifies the patterns and characters in the book, “we can output those words and their spaces as identical as they were before.”
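The shape of that demonstration, stripping the spaces from a text and then recovering the word boundaries, can be sketched in a few lines. This is not the company's method: as described, SynaBase identifies the patterns from the text itself, whereas the sketch below substitutes a simpler, well-known technique (dynamic-programming segmentation) and assumes the vocabulary is supplied in advance. The `segment` function and the sample sentence are hypothetical.

```python
def segment(text, vocabulary):
    """Return one split of `text` into vocabulary words, or None if none exists.

    Assumption: the vocabulary is known up front, unlike the demonstration
    described in the article, where patterns are learned from the text.
    """
    max_len = max(map(len, vocabulary), default=0)
    # best[i] holds a list of words covering text[:i], or None if unreachable.
    best = [None] * (len(text) + 1)
    best[0] = []
    for end in range(1, len(text) + 1):
        # Try the longest candidate word ending at `end` first.
        for start in range(max(0, end - max_len), end):
            word = text[start:end]
            if best[start] is not None and word in vocabulary:
                best[end] = best[start] + [word]
                break
    return best[len(text)]


original = "the genome stores each pattern once"
vocabulary = set(original.split())
stripped = original.replace(" ", "")
restored = " ".join(segment(stripped, vocabulary))
print(restored == original)
```

With an ambiguous vocabulary the function returns only one valid split, which is part of why a round-trip test like this says little about accuracy on real genomic data.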

The 15-person company was founded in 2001 by Robert Hercus, the inventor of the pattern-recognition technology that lies at the heart of SynaBase. Hercus has formed several other firms around the same core algorithms, but Synamatix is the first of these to commercialize its technology, Poole-Johnson said. The company is funded by “a combination of venture capital and angel funding,” Poole-Johnson said, and has also received a $700,000 grant from the government of Malaysia to develop and commercialize its technology.

The company’s product suite will eventually contain at least six additional applications, including SynaMine for sequence analysis, SynaStruct for 2D structural information, and SynaPath for pathway data.

