It’s been a productive couple of weeks for Softberry. The Mount Kisco, NY-based bioinformatics startup completed an annotation of the mouse genome draft, with the results now available through the University of California, Santa Cruz, Genome Browser, and signed an exclusive distribution deal with CTC Laboratory Systems to market its product line in Japan.
Although Softberry has already secured several customers in the Japanese market through its own efforts, Valery Sagitov, the company’s president and founder, said the revenue-sharing distribution agreement would give Softberry “greatly increased share of the Japanese market in our overall sales.”
Sagitov said that CTC was strongly recommended by Softberry’s partner Lion Bioscience, which turned to the Tokyo-based firm to market its ArrayScout and BioScout products in Japan.
Softberry doesn’t spend much on advertising, Sagitov said. Instead, it relies on projects such as the UCSC mouse annotation to raise its visibility. The company also allows potential commercial users to test its products at its website before acquiring a license. Academic users have free access to Softberry’s tools.
Making its mouse annotation freely available on the UCSC browser (http://genome.ucsc.edu) may cost Softberry a few customers who might otherwise have purchased the tool to run the annotation themselves, Sagitov acknowledged, “but the fact that people can use our data and verify it goes a long way in building the reputation of our company.”
Softberry used its FGENESH++ software, a combination of several gene prediction tools, to find 35,650 genes in the mouse genome, of which 23,508 have homology with RefSeq mRNAs or proteins from the non-redundant database, according to the company. In addition, Sagitov said, because the UCSC browser places the company’s results in the broader context of Genscan and Ensembl predictions as well as mRNA mapping and similar genes mapped from different organisms, “all that independent data can be used to judge the quality of our predictions.”
Genscan, for example, predicted about 101,000 genes, “of which a large fraction is, in our opinion, false positive predictions,” said Sagitov.
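For a rough sense of scale, the figures quoted above can be turned into proportions. The short Python sketch below uses only the counts reported in this article; the variable names are illustrative, and the Genscan total is the approximate figure Sagitov cited. The ratios say nothing by themselves about which predictions are correct.

```python
# Counts as reported in the article; the Genscan total is approximate.
fgenesh_genes = 35_650      # genes predicted by FGENESH++
with_homology = 23_508      # of those, genes with RefSeq mRNA or protein homology
genscan_genes = 101_000     # "about 101,000" Genscan predictions

# Share of FGENESH++ predictions backed by homology evidence.
homology_share = with_homology / fgenesh_genes

# Size of the FGENESH++ gene set relative to the Genscan set.
relative_size = fgenesh_genes / genscan_genes

print(f"Predictions with homology support: {homology_share:.1%}")
print(f"FGENESH++ count relative to Genscan: {relative_size:.1%}")
```

By these numbers, roughly two-thirds of the FGENESH++ predictions have homology support, and the FGENESH++ gene set is a little over one-third the size of the Genscan set.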
FGENESH++ first performs ab initio gene prediction using the hidden Markov model-based FGENESH program, then maps known EST/mRNA sequences from RefSeq using the ESTS_MAP program, and finally compares the amino acid sequences of predicted gene products with a database of known proteins to identify homology. The mouse genome took about three days to annotate on a Compaq Alpha processor, Sagitov said.
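The three-step flow described above can be pictured as a chain of stages, each refining a shared list of gene models. The Python sketch below is purely illustrative and is not Softberry’s code: the GeneModel class, the run_pipeline function, and the three toy stages are hypothetical stand-ins with hard-coded data, and the way the stages hand results to one another is an assumption.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class GeneModel:
    """A predicted gene plus the kinds of evidence that support it."""
    gene_id: str
    start: int
    end: int
    evidence: List[str] = field(default_factory=list)


# A stage takes the genomic sequence and the current gene models
# and returns the (possibly updated) gene models.
Stage = Callable[[str, List[GeneModel]], List[GeneModel]]


def run_pipeline(sequence: str, stages: List[Stage]) -> List[GeneModel]:
    """Run each stage in order, passing the gene models along."""
    genes: List[GeneModel] = []
    for stage in stages:
        genes = stage(sequence, genes)
    return genes


def toy_ab_initio(sequence: str, genes: List[GeneModel]) -> List[GeneModel]:
    # Stand-in for step 1; a real predictor would analyze the sequence.
    return [GeneModel("g1", 0, 12), GeneModel("g2", 20, 32)]


def toy_transcript_mapping(sequence: str, genes: List[GeneModel]) -> List[GeneModel]:
    # Stand-in for step 2; the mapped interval below is made up.
    mapped = [(0, 10)]
    for gene in genes:
        if any(s < gene.end and gene.start < e for s, e in mapped):
            gene.evidence.append("transcript")
    return genes


def toy_protein_comparison(sequence: str, genes: List[GeneModel]) -> List[GeneModel]:
    # Stand-in for step 3; the set of genes with protein hits is made up.
    genes_with_hits = {"g1"}
    for gene in genes:
        if gene.gene_id in genes_with_hits:
            gene.evidence.append("protein")
    return genes


genes = run_pipeline(
    "ACGT" * 10,
    [toy_ab_initio, toy_transcript_mapping, toy_protein_comparison],
)
for gene in genes:
    print(gene.gene_id, gene.evidence or ["ab initio only"])
```

In this toy run, one gene model picks up both transcript and protein support while the other is left with its ab initio prediction alone, which mirrors the article’s distinction between predicted genes with and without homology evidence.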
Softberry uses FGENESH++ for human genome annotation as new draft releases become available, and predicted human genes can be viewed in the company’s own genome browser at www.softberry.com/berry.phtml?topic=chrvis. The company expects to release a map of mouse-human syntenic regions on its website by the end of March.