AT A GLANCE
Research assistant professor, Virginia Bioinformatics Institute (VBI)
PhD in molecular genetics at the University of Texas, M.D. Anderson Cancer Center
Developed microarray technologies for Baylor College of Medicine and for Allan Bradley, director of the Sanger Center.
Recently, the Virginia Bioinformatics Institute (VBI) and Johns Hopkins’ Bloomberg School of Public Health began a five-year, $10 million collaborative bioinformatics project to study human infectious diseases such as malaria, tuberculosis, and HIV. Can you tell me a little bit about how microarrays will fit into the new collaborative project?
Microarrays will form the foundation for our studies. Of the total funding, the portion earmarked for research is $285,000 for the first year and about $177,500 for successive years.
In the first year of the study, we will design a set of primers that can be used to make a genome-wide microarray. Oligos will be created using Primer3 software, along with homology-searching algorithms. These products will then be tested using VBI’s real-time PCR platforms prior to array construction to ensure our predictions were correct. It’s the same basic approach that has been used to create microarrays for microbes such as E. coli, Helicobacter pylori, and Caulobacter crescentus.
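To make the screening idea concrete, here is a minimal Python sketch of a first-pass filter that could be applied to candidate oligos before homology searching and PCR testing. It is not Primer3 and not VBI's pipeline: the function names, the GC-content and melting-temperature windows, and the example sequences are all assumptions chosen for illustration. The melting temperature uses the simple Wallace rule (2 °C per A/T, 4 °C per G/C), which is only a rough guide for short oligos.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of bases in the oligo that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> int:
    """Rough melting temperature in deg C using the Wallace rule."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def passes_screen(seq: str, gc_range=(0.40, 0.60), tm_range=(52, 64)) -> bool:
    """Keep oligos whose GC content and Tm fall inside the chosen windows.

    The default windows are illustrative assumptions, not published settings.
    """
    gc_ok = gc_range[0] <= gc_fraction(seq) <= gc_range[1]
    tm_ok = tm_range[0] <= wallace_tm(seq) <= tm_range[1]
    return gc_ok and tm_ok


# Hypothetical candidate sequences, invented for this example.
candidates = ["ATGCGTACGTTAGCCTAGGA", "AAAAAAAATTTTTTTTAAAA"]
kept = [seq for seq in candidates if passes_screen(seq)]
```

Oligos that survive a filter like this would still need the homology search and the real-time PCR check described above, since neither GC content nor a rule-of-thumb Tm says anything about cross-hybridization.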
I understand you are going to develop microarrays to assess virulence factors of certain pathogens. Which pathogens are you going to select and why?
We are working closely with Johns Hopkins to produce microarrays to study malaria and tuberculosis. Pathogens have been selected based on their human health impact and modes of pathogenesis. For instance, there are an estimated 300 to 500 million malaria cases each year, and increasing drug resistance is a major obstacle to effective treatment for both malaria and tuberculosis. Vaccines would clearly be a better control mechanism if we could understand how to make effective ones.
What sorts of arraying methods and equipment are you going to use to spot your arrays?
We use an epoxysilane attachment chemistry and spot the arrays with a GeneMachines Omnigrid using Telechem Stealth pins. For signal detection, we use a [Packard] ScanArray 5000XL. To store and analyze the expression data generated by our experiments, we will use GeneX, an open source gene expression database and analysis toolkit developed under the direction of Jennifer Weller, a scientist currently at VBI.
What are the major obstacles and challenges to using arrays?
As adequate controls and replicates for microarray data are fundamental requirements for producing quality results, more than 60 percent of our time is devoted to ensuring that adequate QC measures are put in place during the arraying process. Another significant challenge in dealing with arrays is managing the large quantities of data that result from experiments. There is currently an upload tool that makes it easy to insert Affymetrix data into the GeneX database, and a similar tool is under development to upload data from custom microarrays.
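One common way to put a number on replicate quality is the coefficient of variation (CV) across replicate spots of the same probe. The sketch below is a generic illustration, not the QC procedure used at VBI: the probe names, intensity values, and the 0.25 cutoff are assumptions made up for the example.

```python
from statistics import mean, stdev


def coefficient_of_variation(intensities):
    """Sample standard deviation divided by the mean of replicate spots."""
    return stdev(intensities) / mean(intensities)


def flag_noisy_probes(replicates, max_cv=0.25):
    """Return the names of probes whose replicate CV exceeds max_cv.

    `replicates` maps a probe name to the intensities of its replicate
    spots. The 0.25 cutoff is an illustrative assumption.
    """
    return sorted(
        name
        for name, values in replicates.items()
        if coefficient_of_variation(values) > max_cv
    )


# Hypothetical background-corrected intensities for three probes.
replicates = {
    "probe_001": [100.0, 100.0, 100.0],
    "probe_002": [50.0, 100.0, 150.0],
    "probe_003": [980.0, 1010.0, 1005.0],
}
noisy = flag_noisy_probes(replicates)
```

Probes flagged this way would typically be re-examined or excluded before any downstream analysis.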
I have heard that microarray analysis presents some thorny problems. How do you analyze array data? What sorts of data mining techniques do you use?
Microarray data often contains considerable amounts of variation, stemming from both experimental and technical sources. The GeneX system includes a clustering procedure that takes this variation into account and assigns a confidence metric to all resulting gene groups. Implemented by Karen Schlauch at VBI, this tool supplies the user with a measure of "how sure" they can be that their clustering results are meaningful.
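The interview does not describe how the GeneX confidence metric is computed, so the following is only a sketch of one generic way to attach a "how sure" number to a grouping: resample the replicate measurements many times and count how often two genes still look co-expressed. Every name, the correlation threshold, and the data are assumptions for illustration, and this is a swapped-in technique rather than the GeneX procedure.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is flat)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def co_cluster_confidence(gene_a, gene_b, n_resamples=500, threshold=0.9, seed=0):
    """Fraction of resamples in which the two genes correlate above threshold.

    Each gene is a list of conditions, and each condition is a list of
    replicate measurements. One replicate per condition is drawn at random
    in every resample.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_resamples):
        profile_a = [rng.choice(reps) for reps in gene_a]
        profile_b = [rng.choice(reps) for reps in gene_b]
        if pearson(profile_a, profile_b) >= threshold:
            hits += 1
    return hits / n_resamples


# Hypothetical log-ratios: three conditions, two replicates each.
gene_up_1 = [[1.0, 1.1], [2.0, 2.1], [3.0, 3.1]]
gene_up_2 = [[2.0, 2.2], [4.0, 4.1], [6.0, 6.2]]
gene_down = [[3.0, 3.1], [2.0, 2.1], [1.0, 1.1]]

same_trend = co_cluster_confidence(gene_up_1, gene_up_2)
opposite_trend = co_cluster_confidence(gene_up_1, gene_down)
```

A value near 1 means the pairing survives the replicate noise, while a value near 0 means it does not; noisier replicates push the score toward the middle.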
Also available within GeneX is a series of statistical tests that determine, at a specified significance level, whether genes are differentially expressed. Together with an ANOVA procedure to detect variation of specific effects, we believe that these tools will provide methods of generating meaningful hypotheses of the activity, interactions, and relevance of genes under infection.
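As an illustration of testing a single gene at a specified significance level, here is a minimal sketch of an exact two-sample permutation test on the difference in mean expression. The interview does not say which tests GeneX implements, so this is a stand-in technique; the function names, the 0.05 level, and the expression values are assumptions for the example.

```python
from itertools import combinations


def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation p-value for a difference in means.

    Enumerates every way of splitting the pooled values into groups of the
    original sizes, so it is only practical for small replicate counts.
    """
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    n_b = len(pooled) - n_a
    total = sum(pooled)

    def abs_mean_diff(indices):
        sum_a = sum(pooled[i] for i in indices)
        return abs(sum_a / n_a - (total - sum_a) / n_b)

    observed = abs_mean_diff(range(n_a))
    splits = list(combinations(range(len(pooled)), n_a))
    # Small tolerance so floating-point rounding does not drop ties.
    extreme = sum(1 for s in splits if abs_mean_diff(s) >= observed - 1e-12)
    return extreme / len(splits)


def is_differentially_expressed(control, infected, alpha=0.05):
    """True if the permutation p-value falls below the chosen level."""
    return permutation_p_value(control, infected) < alpha


# Hypothetical log-expression values, four replicates per condition.
control = [1.0, 1.1, 0.9, 1.0]
infected = [2.0, 2.1, 1.9, 2.0]
p_value = permutation_p_value(control, infected)
```

With thousands of genes on an array, a per-gene threshold like this would normally be combined with a multiple-testing correction, which the sketch leaves out.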
What do you hope to gain from the array research?
We hope to develop an oligo-based microarray that can be printed in virtually unlimited quantities and shared by any faculty member interested in the physiology of the organism. We are also seeking to develop full-length, expression-competent constructs to assess the functional roles of genes critical in host-vector and host-pathogen infection processes; a vector system to generate site-directed or systematic loss-of-function mutations in host, vector, or pathogen genomes; and interesting and meaningful hypotheses of gene activity under infection conditions.