Procter & Gamble's Jay Tiesman on Sources of Affy Array Variation



Senior scientist in the corporate research biotechnology division of Procter & Gamble, Miami Valley Labs, Cincinnati, Ohio

At a Midwestern Affymetrix users group meeting last May, Procter & Gamble scientist Jay Tiesman made a presentation on sources of chip data variation that is still reverberating through the microarray community. Although Tiesman discussed a number of sources of variation in the chips, including target preparation and labeling, scanning order, and processing order, one source caught users’ eyes: lot-to-lot variation. This is because at the time Affymetrix claimed that its chips produced consistent data from lot to lot.

But when Tiesman and his colleagues at Procter & Gamble did a lot-to-lot comparison of data results for six arrays from two different lots using principal component analysis, they found an obvious difference between lots.

“Lots CAN make a difference,” Tiesman boldly emphasized on his PowerPoint slides. He recommended that users stagger lots across an experiment to counter this variability. Last week, BioArray News spoke briefly to Tiesman on the issue of chip-to-chip variability, data analysis, and microarray research in general.

Q: In your talk “Experimental Designs to Minimize Data Variation in GeneChip Experiments,” you discussed a number of sources of variation. While your comments on lot variation gained the most attention, what do you think is the biggest source of variation?

A: The biggest source was target labeling. Since it involves a number of relatively inefficient enzymatic steps, and in a series of them you multiply your inefficiencies, it makes it difficult to exactly replicate the data from one experiment to another.

I am also involved in an ILSI (International Life Sciences Institute) consortium, a group of people working on toxicogenomics. We have been doing experiments with Affymetrix chips, and people have been taking the same RNA samples and labeling them at different technical centers. When we put the data together, it clustered according to where it was labeled, not [other factors].
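The point about multiplying inefficiencies can be made concrete with a little arithmetic. The sketch below is illustrative only: the step count and the per-step efficiencies are hypothetical numbers, not figures from Tiesman's talk. It shows how a modest per-step difference between two labeling runs compounds into a much larger difference in overall yield.

```python
from math import prod

def overall_yield(step_efficiencies):
    """Overall yield of a multi-step protocol: the product of per-step efficiencies."""
    return prod(step_efficiencies)

# Hypothetical values chosen only for illustration (not from the interview).
run_a = [0.80, 0.80, 0.80, 0.80]  # four enzymatic steps at 80% each
run_b = [0.70, 0.70, 0.70, 0.70]  # the same steps, each somewhat less efficient

yield_a = overall_yield(run_a)
yield_b = overall_yield(run_b)

# A per-step ratio of about 1.14 (0.80 / 0.70) grows with every added step.
print(f"run A: {yield_a:.3f}, run B: {yield_b:.3f}, ratio: {yield_a / yield_b:.2f}")
# prints "run A: 0.410, run B: 0.240, ratio: 1.71"
```

Under these assumed numbers, two runs that differ by about 14 percent at each step end up differing by about 70 percent in labeled target, which is one way to read why samples labeled at different centers can cluster by center.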

Q: How do you handle this source of variation?

A: We do a single large labeling step. We’ve moved to a 96-well plate format and are building a system that gives us a higher ability to do all of the experiments at one time. We also use master mixes of our solutions to try to maintain as high a standard of our experimental consistency as we can. We are in the process of validating [this system] and making it work as expected.

Also we can try to design the experiment so we don’t confound our data analysis. We don’t have treated sample labeled on one day, and the control on another day. Experimental design is something that no one spends a lot of time talking about. People think ‘we just throw the experiment together any way we want, and let the bioinformatics people figure it out afterwards.’ In our group, it has become critical to have these people involved in every aspect of experimental design.
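One way to picture a design that avoids this kind of confounding is a balanced assignment, in which every treatment group is spread evenly over labeling days and chip lots. The following is a minimal sketch under stated assumptions: the function name `balanced_assignment`, the sample IDs, and the day and lot labels are all hypothetical, and this is not a description of Procter & Gamble's actual procedure.

```python
import random
from collections import Counter
from itertools import cycle, product

def balanced_assignment(samples_by_group, days, lots, seed=0):
    """Spread each group evenly over all (day, lot) blocks.

    samples_by_group: dict mapping group name -> list of sample IDs.
    Returns a dict mapping sample ID -> (group, day, lot).
    """
    rng = random.Random(seed)
    blocks = list(product(days, lots))
    plan = {}
    for group, samples in samples_by_group.items():
        shuffled = samples[:]
        rng.shuffle(shuffled)  # randomize which sample lands in which block
        # Cycling through the blocks keeps per-block counts within one of each other.
        for sample, (day, lot) in zip(shuffled, cycle(blocks)):
            plan[sample] = (group, day, lot)
    return plan

# Hypothetical experiment: four treated and four control samples.
samples = {
    "treated": [f"T{i}" for i in range(1, 5)],
    "control": [f"C{i}" for i in range(1, 5)],
}
plan = balanced_assignment(samples, days=["day1", "day2"], lots=["lotA", "lotB"])

# Count how many samples of each group fall in each (day, lot) block.
counts = Counter(plan.values())
for sample in sorted(plan):
    print(sample, *plan[sample])
```

With four samples per group and four day-by-lot blocks, each block receives exactly one treated and one control sample, so neither labeling day nor chip lot lines up with treatment.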

Q: Have you considered changing the kind of labels you use?

A: Yes, another interesting way to get at this question would be to move from enzyme labeling to something more chemical, with higher efficiency. A company called Kreatech, which exhibited at GSAC, is working on more chemically based labeling technologies. The assumption is that the chemical labeling technologies are really a lot more efficient than these enzymatic labeling technologies. We would need to run some experiments and see how that would act. But in the meantime, we have to plan to work out potential issues that may come to pass, and get into the details like shipping lot.

Q: How important is this idea of variation between different lots in data analysis?

A: It comes down to how you look at the data. We really want to push the data as hard as we can. We’re looking at a whole lot more than lists of two-fold upregulation or downregulation of genes. If you are looking at data from a heuristic framework, you can get by with a lot more than if you start to use statistics. But when you move to higher-level statistical analysis the requirements are much more rigorous. We’ve talked to Affymetrix about this quite a bit. They were interested in our use of principal components analysis for quality control aspects as well. I think that overall we’ve pretty much shown them that on occasion chip lot can be an issue. There have also been instances where we have done replicates and have put different chip lots in, and weren’t able to see the difference.
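For readers unfamiliar with principal components analysis as a quality-control check, here is a rough sketch of the idea on simulated data. Everything in it is an assumption made for illustration: the six arrays, the 200 probes, the size of the lot shift, and the noise level are invented, and the pure-Python power iteration stands in for whatever software Tiesman's group actually used. The data are simulated so that a lot effect exists; the sketch only shows how such an effect would surface in the first principal component.

```python
import random
from math import sqrt

def first_pc_scores(arrays, iterations=100):
    """Return each array's score on the first principal component.

    arrays: list of equal-length lists, one list of probe intensities per array.
    Uses power iteration on the small arrays-by-arrays Gram matrix.
    """
    n, p = len(arrays), len(arrays[0])
    # Center every probe across arrays so PCA captures variation between arrays.
    means = [sum(a[k] for a in arrays) / n for k in range(p)]
    centered = [[a[k] - means[k] for k in range(p)] for a in arrays]
    gram = [[sum(x * y for x, y in zip(ci, cj)) for cj in centered] for ci in centered]

    v = [1.0] + [0.0] * (n - 1)  # arbitrary non-zero start vector
    for _ in range(iterations):
        w = [sum(g * x for g, x in zip(row, v)) for row in gram]
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient gives the leading eigenvalue; scores are sqrt(eigenvalue) * v.
    eigenvalue = sum(vi * sum(g * x for g, x in zip(row, v)) for vi, row in zip(v, gram))
    return [sqrt(eigenvalue) * vi for vi in v]

# Simulated data (hypothetical): three arrays from each of two lots.
rng = random.Random(42)
n_probes = 200
baseline = [rng.gauss(8.0, 2.0) for _ in range(n_probes)]
lot_shift = [rng.gauss(0.0, 0.5) for _ in range(n_probes)]  # applied to lot B only

lots = ["A", "A", "A", "B", "B", "B"]
arrays = [
    [b + (s if lot == "B" else 0.0) + rng.gauss(0.0, 0.1)
     for b, s in zip(baseline, lot_shift)]
    for lot in lots
]

scores = first_pc_scores(arrays)
for lot, score in zip(lots, scores):
    print(f"lot {lot}: PC1 score {score:+.2f}")
```

In this simulation the two lots fall on opposite sides of zero on the first component. In real data, a clean split like that along a technical factor such as lot would be the warning sign; the absence of one would be consistent with the replicates Tiesman mentions where no lot difference was visible.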
