Golden Helix, a 10-year-old Bozeman, Mont.-based bioinformatics firm, will later this month debut the latest version of its software for SNP and copy-number analysis, according to a company official.
CEO Christophe Lambert said recently that the new package, called SNP & Variation Suite 7, was designed to overcome constraints of its predecessor software package, SVS version 6. Golden Helix launched its first analysis tool, called HelixTree, for association studies, in 1998.
According to Lambert, SVS 7 was designed to overcome data-capacity constraints, limited visualization capabilities, and data-workflow challenges associated with high-density SNP-genotyping arrays and second-generation sequencers.
It will be available as a free upgrade to existing customers in the second half of February.
“Our current product was reaching the end of its lifecycle for a few reasons,” Lambert said during a recent webinar hosted by the company. “The size of genetic analysis has gotten really large, particularly with some of the copy number and genome-wide association studies, and so there were data size limits.
“We also had limited visualization capabilities,” he added. “We had some very specific plots — for example, visualizing P values — but if you wanted to plot complex data, you probably would have had to go somewhere else.”
Lambert told BioArray News last week that, to date, SVS has focused on “industrial strength analysis,” but needed to be adjusted to handle the larger data sets generated by genome-wide association studies.
“A lot of it was enabling a desktop computer to handle these large data sets while looking down the road to the genome-wide sequencing platforms and the challenges they will make for researchers,” he said.
“We are starting to see some customers doing 10,000-sample GWAS studies that were going over the limits of our software, so we thought if we were going to handle that, we would build [SVS 7] to handle thousands of magnitude more,” he added.
Among the new features included in SVS 7 are interactive visualization and an integrated genome browser; real-time data editing and enrichment; advanced predictive modeling; quality assurance and stratification correction; and enhanced genome-wide association analyses, including the ability to use genetic covariates in regression.
“Looking to the future, one thing is clear: Data sizes are not going to get smaller,” said Lambert. “We wanted to make sure that we could scale massively from a data perspective. We have tested it with 40 million columns, for example. We made sure that the underlying architecture lent itself towards future enhancement and integration.”
SVS 7 also enables users to collaborate more easily, according to the firm. “Research is often a collaborative effort, but it is hard to collaborate if one researcher uses our SNP & Variation Suite for his or her analysis, but has a colleague who uses something else,” said Lambert. To deal with this issue, SVS 7 includes SVS Viewer, a “free tool that allows anyone to open SVS projects, play with the data, and manipulate graphs.”
The development of SVS 7 was shaped in part by the increasing prominence of Golden Helix’s services business within the company. Lambert estimated that as much as a third of the privately held company’s revenues this year might come out of the large projects it runs in its labs.
“We have had these collaboration arrangements and analyzed as a service large-scale and small-scale studies,” Lambert said. “We have also had a dozen or so data test sites where we got tremendous amounts of feedback from our users. In the process of improving our software, we thought about overcoming constraints, and so we were open to meeting their needs best.”
Andy Ferrin, director of sales and marketing at Golden Helix, told BioArray News last week that the company is involved in “several copy number association studies, SNP-association studies, and diagnostics development” via its services business. The company performs analysis on its customers’ datasets. “We can combine SNP, CNV, expression, and clinical data all in concert with each other,” he said.
Ferrin said that there is no one company that competes against Golden Helix head to head. “There are a very small number of competitors for a very small market,” he said. “If you look at the [software] market for genome-wide studies or association studies, there really is nobody else than us.”
One customer group Golden Helix is trying to attract comprises researchers who are newer to GWAS and who lack the bioinformatics skills of trained statisticians. As John Forsythe, director of marketing at the company, told BioArray News last week, the “market is still growing for genome-wide analysis because the cost of arrays is going down and the size is getting bigger, so there are more people coming into the field who have knowledge of genetics but need this kind of software for these large projects.”
Peter Gregersen, a geneticist at the Feinstein Institute for Medical Research in Manhasset, NY, said that he has upgraded to SVS 7 so that he can look at the large volumes of data produced at FIMR without the aid of bioinformaticists.
“We are producing somewhere around 3 million-4 million genotypes per week and there is a fairly high production of data,” Gregersen told BioArray News this week. “I have a staff of statistical geneticists who do the bulk of the analysis, but it is nice to look at the data, to be able to cut it, send it to people, cut it into small sections so that someone like myself can do the analysis,” he said.
Gregersen said that his staff uses all the “standard, free available software that is out there available for analysis,” but that he personally uses SVS 7 because it “enables the non-expert statistical geneticists to look at the data and ask questions about it without having to learn every last little program.”
Andrew Hicks, a geneticist at the European Academy Bozen in Italy, provided a similar assessment of SVS 7. “We have a full statistics department here that is used to handling things in different platforms,” Hicks said. “Where Golden Helix fits in for us is that our lab group is not qualified statisticians and are not trained to manipulate data sets, it allows us to handle the data for preliminary exploration.”
Hicks said he is unaware of competitive packages to SVS 7, though there are free tools available. According to Hicks, all software packages, including SVS, should be improved in the future to facilitate meta-analysis of data sets.
“I think there is a little bit of a disconnect between what a statistician can do scripting through an analysis package and what is available through these software packages,” Hicks said. “A lot of the packages should enable multiple centers doing meta-analysis of certain data. Another issue is we have a high degree of relatedness between samples, if that degree of relatedness rises above a certain threshold, then the genomic control, that method of correcting isn’t robust as it needs to be.”