Golden Helix last week launched a sequence-analysis module for its flagship SNP & Variation Suite software.
While the firm's customers will be able to use the module to conduct tertiary analysis of next-generation DNA sequencing data, the tool could also benefit array users, particularly those using high-density chips in association studies, according to the firm.
"The format for sequence data is essentially identical to the format for [genome-wide association studies] data, and most of the tools for rare-variant analysis and annotation-based filtering may be used for both SNP array data and next-generation sequencing variant calls," said Bryce Christensen, a statistical geneticist at the Bozeman, Mont.-based company.
Christensen noted that the "newest generation of SNP arrays, such as Illumina’s HumanOmni2.5 product, contain extensive rare content identified by the 1000 Genomes Project," and said that the firm's "new tools for rare variant analysis will be especially helpful to researchers using these platforms."
Golden Helix's new module is being introduced with the latest upgrade of its SVS software, version 7.4. The new offering is being packaged primarily for use with next-generation sequencing systems. CEO Christophe Lambert said in a statement that the firm's new product will allow its customers to "capitalize on all that next-gen sequencing has to offer."
The firm touts the ability of SVS 7.4 to import, manage, and manipulate millions of variants across sample sets ranging from a few samples to thousands; sort through millions of variants, filtering out those that are common, benign, poorly covered, or irrelevant; find genes or regions with an abundance of variants in a given sample set; assess rare-variant burden using collapsing and association methods; and understand the contribution of rare variants using functional prediction.
While the tool is advertised as "compatible with variant files from most major sequencing platforms," the new functionalities are also applicable to array-based projects. Christensen cited the data-filtering capabilities in SVS 7.4 as lending themselves not only to association studies, but to the creation of custom genotyping arrays.
In particular, he mentioned the software's "collapsing methods." According to Christensen, traditional association techniques used in GWAS do not have the power to detect the significance of rare variants individually, nor do they provide tools for measuring their compound effect. To do this, it is necessary to "collapse" several variants into a single covariate based on regions such as genes.
Using SVS 7.4, researchers can use several published approaches to study rare variants. The Cohort Allelic Sum Test, for example, allows users to count and compare the number of individuals with one or more mutations in a region or gene between affected and unaffected groups.
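The counting-and-comparing idea behind a test of this kind can be sketched in a few lines of Python. This is an illustrative sketch only, not Golden Helix's implementation: the function names, the toy genotype data, and the choice of a two-sided Fisher's exact test for the comparison are all assumptions made for the example.

```python
from math import comb

def carrier_counts(genotypes, is_case):
    """Count individuals carrying at least one minor allele in a region.

    genotypes: one list per individual of minor-allele counts (0, 1, or 2)
               for the variants in a single gene or region (hypothetical layout).
    is_case:   one boolean per individual, True for affected.
    """
    case_carriers = control_carriers = n_cases = n_controls = 0
    for alleles, case in zip(genotypes, is_case):
        carrier = any(count > 0 for count in alleles)
        if case:
            n_cases += 1
            case_carriers += carrier
        else:
            n_controls += 1
            control_carriers += carrier
    return case_carriers, n_cases, control_carriers, n_controls

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x carriers landing in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))

# Toy data: four affected and four unaffected individuals, three variants in one gene.
genotypes = [[1, 0, 0], [0, 0, 1], [0, 2, 0], [0, 0, 0],   # affected
             [0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0]]   # unaffected
is_case = [True] * 4 + [False] * 4

case_carriers, n_cases, control_carriers, n_controls = carrier_counts(genotypes, is_case)
p_value = fisher_exact_two_sided(case_carriers, n_cases - case_carriers,
                                 control_carriers, n_controls - control_carriers)
```

Each individual contributes a single yes/no value for the whole region, which is what lets several rare variants be tested together rather than one at a time.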
Meantime, the Combined Multivariate and Collapsing feature bins variants in certain regions according to a criterion such as minor allele frequency, collapses the variants within each bin, and performs multivariate testing on the counts across the various bins.
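The binning and collapsing steps described above might look like the following Python sketch. It is a simplified illustration under stated assumptions: the function name, the bin boundaries, and the toy data are hypothetical, variants above the highest frequency boundary are simply ignored here, and the multivariate test that would then be run on the resulting columns is left out.

```python
def collapse_by_maf_bin(genotypes, mafs, bin_edges):
    """Collapse variants into one 0/1 indicator per minor-allele-frequency bin.

    genotypes: one list per individual of minor-allele counts (0, 1, or 2).
    mafs:      minor allele frequency of each variant, in the same order.
    bin_edges: ascending upper bounds of the bins, e.g. [0.01, 0.05]
               (hypothetical thresholds chosen for this example).
    """
    def bin_of(maf):
        for index, edge in enumerate(bin_edges):
            if maf <= edge:
                return index
        return None  # more common than the last edge: not collapsed in this sketch

    variant_bins = [bin_of(maf) for maf in mafs]
    collapsed = []
    for alleles in genotypes:
        row = [0] * len(bin_edges)
        for count, b in zip(alleles, variant_bins):
            if b is not None and count > 0:
                row[b] = 1  # individual carries at least one variant in this bin
        collapsed.append(row)
    return collapsed

# Toy data: four variants with assumed frequencies, four individuals.
mafs = [0.005, 0.008, 0.03, 0.20]
genotypes = [[1, 0, 0, 2],
             [0, 0, 1, 0],
             [0, 0, 0, 1],
             [0, 1, 1, 0]]
collapsed = collapse_by_maf_bin(genotypes, mafs, [0.01, 0.05])
```

The output has one column per bin, so a multivariate test compares affected and unaffected groups on a handful of collapsed columns instead of on every rare variant separately.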
"Conventional statistical methods used in GWAS lack the necessary power to detect associations with rare variants," Christensen said. "The collapsing methods incorporated in SVS 7.4 provide additional power that may be helpful in that regard."
Other components of Golden Helix's software upgrade include an enhanced genome-annotation track-management system, which supports interactive data analysis and visualization and allows GWAS results to be compared side-by-side with annotation data.
Both the collapsing methods and the track-management system could be useful for customers who would like to filter their results, whether from genome-wide arrays or next-gen sequencing, to select a set of SNPs for custom genotyping on arrays.
According to Christensen, SVS users can use the annotation tracks as a reference to filter SNP and sequence data.
"For example, you can identify the SNPs that are located within exons, outside of known CNV regions, [that] were not included in dbSNP129, or had a frequency less than a specified threshold in the 1000 Genomes populations, all with just a few mouse clicks," he said. "Any data points that can be associated with a physical position in the genome may be used to filter SNPs and create interesting subsets of the data."
Additionally, the new version of SVS offers tools for pairwise identity-by-descent estimation, linkage-disequilibrium pruning, and estimating inbreeding coefficients.
"Researchers who are using sequence data, or any SNP data, for that matter, to construct custom genotyping arrays will find SVS 7.4 very helpful," said Christensen. "The annotation-filtering tools will help them to identify novel or otherwise interesting content in their sequence data … and the LD-pruning tool will help them reduce the number of redundant SNPs, if necessary, before creating the final list of SNPs for their custom array."
According to Christensen, other features contained in Golden Helix's new software include the ability to perform SNP quality-control procedures recommended in a paper on GWAS quality that was published in Genetic Epidemiology last year by a group led by researchers at the University of Washington.
For copy-number variant analysis, the company has also added a tool enabling researchers to detect and correct wave effects. Christensen said Golden Helix has also made "substantial improvements" to the process for importing Affymetrix CEL files for CNV analysis.
While Golden Helix claims to have no commercial competitors, plenty of software firms have introduced tools for sequence analysis in recent months. As BioArray News sister publication BioInform noted, Spanish bioinformatics firm Integromics rolled out a tertiary NGS sequence-analysis tool in October, the same month that Genedata launched the Genedata Expressionist Refiner Genome module to process and analyze NGS data in addition to the mass spectrometry, genomic profiling, and microarray modules in its Expressionist analysis platform.
Golden Helix also competes against El Segundo, Calif.-based BioDiscovery, which in November 2010 launched version 5.1 of its Nexus Copy Number software that also supports array and sequencing data analysis (BAN 11/16/2010).