Microarray giant Affymetrix and software startup Biotique Systems are linking up to allow anyone so inclined to take a visual journey into previously unexplored places on the human genome.
Affymetrix has tapped Biotique Systems to build a web-based front end for a massive repository of transcription data it is collecting in a research project funded by the National Cancer Institute. The company announced the relationship – first reported by BioInform’s sister publication, BioArray News, on Aug. 30 – late last week.
The collaboration will be funded through a $4 million NCI grant Affymetrix announced in July.
This is a big deal for a startup funded out of the pockets of its founders, former DoubleTwist executives John Burke and Stephen Sanders. “Yes,” Burke told BioInform, “this puts us on the map.”
“They could put in place the right tools,” Thomas Gingeras, Affymetrix vice president of biological sciences, said. “We thought it would be useful to have a visualization tool and to allow people to make some queries of the data.” The company already had a program in place, in cooperation with the University of California, Santa Cruz, for visualizing the data but Biotique’s software – off-the-shelf – provided the enhancements that Affymetrix was seeking.
The agreement between the giant and the startup came together so quickly that, late last week, corporate lawyers were still working out details.
Biotique Systems’ “right tool,” based on its BLIS [Biotique Local Integration System] platform, enables secure data integration, analysis, and visualization. These tools will allow the public to interact with large amounts of data, yet still preserve and protect Affymetrix’s proprietary information.
For the Affy portal, Biotique added to BLIS a customized plug-in module, the Biotique Expression Analysis Module. Online, viewers will click to enter a beta version of Affy’s Human Transcriptome Server.
Behind the scenes, users will interact with Biotique’s software platform, which runs on Linux or Unix and integrates – and continuously updates – human genome data to provide a reference template. The application will present a genomic viewer that lets users view expression data and query it.
Biotique says some 70 percent of the functions of its BLIS product will not be available in this implementation. These include batch management and processing, user accounts, and comparative genomics support. Since the demo server is anonymous and does not require login with a user name and password, all functions allowing the user to add data have been turned off.
The site will allow users to query NCI/Affymetrix transcriptome maps alongside public genomic annotations, Affymetrix chip designs, and clones from the Mammalian Gene Collection.
Users will be able to browse an area of the genome, click on any chromosome in the map, and then zoom in on that area, down to the single-base level, and view transcriptome data and genomic annotations aligned graphically on the genomic region.
For Affymetrix, the project, from first gathering and publishing the data to now making it publicly available, will influence the tools that it produces. The company will, within a year, incorporate in its chips probes that interrogate from near the 5’ end of the gene in addition to its probes that today work preferentially near the 3’ end, said Gingeras.
Internally, Affymetrix will draw on capabilities acquired in its $76 million purchase of computational genomics firm Neomorphic in October 2000.
Simon Cawley, the company’s bioinformatics scientific lead, who came to Affymetrix from Neomorphic, said the project has been technologically challenging. “You can’t just throw up flat files and say, ‘Here, go play with it,’” he said. The Affy bioinformatics team has set up a pipeline for processing, storing, and retrieving large amounts of data, and supplied a terabyte of storage. The developers are also setting up clusters of databases to distribute the large volumes of data expected to come out of the project.
Initially, the data offered will consist of transcript data for chromosomes 21 and 22 and will be available at www.affymetrix.com/transcriptome/index.affx or www.biotiquesystems.com. The data, which today would fill the equivalent of four CD-ROMs, results from research published in the May 3, 2002 issue of Science.
Affymetrix and NCI are using special arrays that contain 25-mer probes spaced, on average, every 35 base pairs along the chromosomes in the genome to conduct this research, which has revealed that large regions of the genome – beyond the coding segments – are transcribed. The second and ongoing phase of this effort will involve researching nine chromosomes and finishing maps within a year, generating yet more data. Because this is NCI-funded, the data will be publicly available. The goal of the project is to build whole-genome arrays that interrogate the genome at resolutions approaching the single-nucleotide level.
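To give a rough sense of what that probe spacing means in practice, the short Python sketch below tiles 25-mer probes along a sequence. It is an illustration only, not Affymetrix’s array-design method: the function names `tile_probes` and `probe_count` are hypothetical, and the code assumes a uniform 35-base step between probe start positions, whereas the real arrays are spaced every 35 base pairs only on average.

```python
def tile_probes(sequence, probe_len=25, step=35):
    """Return (start, probe) pairs for probes tiled along `sequence`.

    Assumption: probes start at a fixed interval of `step` bases.
    Real tiling arrays vary the spacing, for example to avoid repeats.
    """
    probes = []
    # Stop once a full-length probe no longer fits in the sequence.
    for start in range(0, len(sequence) - probe_len + 1, step):
        probes.append((start, sequence[start:start + probe_len]))
    return probes


def probe_count(region_len, probe_len=25, step=35):
    """Number of probes needed to tile a region of `region_len` bases."""
    if region_len < probe_len:
        return 0
    return (region_len - probe_len) // step + 1


if __name__ == "__main__":
    toy_sequence = "ACGT" * 25  # 100-base toy sequence, not real genomic data
    for start, probe in tile_probes(toy_sequence):
        print(start, probe)
    # Probes needed per million bases under the uniform-step assumption.
    print(probe_count(1_000_000))
```

Under these assumptions each million bases of sequence takes a little under 29,000 probes, which suggests why moving from two chromosomes toward whole-genome arrays multiplies the data volume the team has to store and serve.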