Gene Expression Technical Guide

Table of Contents

Letter from the Editor
Index of Experts
Microarrays: Guilherme Rosa, Lisa White, and Arnoud van Vliet
qPCR: Anders Ståhlberg and Chang-Qi Zhu
RNA-seq: Jürg Bähler and Rui Chen
MicroRNA Expression Profiling: Pieter Mestdagh and Bin Shuai
List of Resources

Letter from the Editor

When it comes to gene expression studies, it's all but impossible to say that Technique X is always superior to Technique Y. That's because, at present, there's no clear winner among microarrays, quantitative PCR, and RNA sequencing, once you consider the pros and cons of each in the context of a given experiment. What microarrays lack in quantitative accuracy is made up for by the assay's throughput and relative ease of use. While qPCR is highly sensitive and specific, the technique's throughput isn't always up to par. Although RNA-seq allows for sensitive, specific, and high-throughput gene expression profiling, the nascent method has thus far seen a lack of standardization among its users.

Throughout this technical guide, our expert contributors promote multi-platform approaches as the gold standard for gene expression studies. Indeed, candidate identification via microarray analysis followed by qPCR is a tried and true method for the detection of differentially expressed genes. RNA-seq hits are also best confirmed via microarray analysis and/or qPCR.

As multiple platforms connote multiple protocols, confusion can quickly arise. How do I know what level of coverage to aim for when running RNA-seq? What measures should I take to ensure I adhere to any technique-specific guidelines? With seemingly endless options, what's my best bet for microRNA expression studies? The answers to all of these questions, and more, are spelled out in the pages that follow. As always, be sure to consult the resource page at the end of this guide for a list of publications and links to the bioinformatics tools our experts have referred to in their responses.

— Tracy Vence

Index of Experts

Many thanks to our experts for taking the time to contribute to this technical guide, which would not be possible without them.

Jürg Bähler
University College London

Rui Chen
Baylor College of Medicine

Pieter Mestdagh
Ghent University Hospital

Guilherme Rosa
University of Wisconsin-Madison

Bin Shuai
Wichita State University

Anders Ståhlberg
University of Gothenburg

Arnoud van Vliet
Institute of Food Research

Lisa White
Baylor College of Medicine

Chang-Qi Zhu
University of Toronto

Microarrays: Guilherme Rosa, Lisa White, and Arnoud van Vliet

Genome Technology: How many biological and/or technical replicates do you perform on any given sample? Why?

Guilherme Rosa: Like with any other experiment, determining an appropriate sample size is a very important step with microarray assays. Sample size calculation depends on the experiment aim(s) (e.g., detection of differential expression, classification of samples, gene network reconstruction, among others), in addition to other specifics of the research project, such as the nature of the biological material (e.g., complexity of the samples of interest), number of samples available, logistic constraints, et cetera. From a statistical viewpoint, the number of replications should be chosen such that an appropriate statistical power or efficiency of the experiment on accomplishing such purposes can be obtained. The statistical power of an experiment — this refers not only to microarray gene expression assays, but to any kind of experiment — is a function of the significance level adopted (or Type I error rate, i.e. the probability of rejecting a null hypothesis when it is actually true), the sample size, the effect size (i.e. the size of the effect in the population), and the test statistic. Hence, for a given significance level, test statistic, and effect size, the power (e.g., to detect differentially expressed genes) is increased with larger sample sizes. Nonetheless, sample size determination must also consider ethical, budgetary, and labor constraints.

More specifically, on the issue of biological and technical replications, there is a fundamental difference between them that should be taken into account when designing and analyzing a microarray experiment. This is indeed an important — but not always well-appreciated — subject in the microarray literature. Biological replication refers to the number of independent samples assayed, where a sample may be represented by a subject (animal or plant) or a pool of subjects (e.g., when each subject does not provide enough RNA for hybridization). Technical replication, on the other hand, refers to multiple RNA extractions from the same biological sample, or to multiple aliquots of the same RNA extraction, or even to multiple spots of each cDNA clone or oligonucleotide on the microarray slide. While biological replication is essential for drawing conclusions from the experiment, technical replication may improve precision and allow for testing differences within treatment groups. My own research interest refers to design and analysis of experiments aiming to understand the genetic control of gene activity. In such cases — especially with out-bred populations, such as the livestock species I work with — biological replication is generally preferred over technical replication.
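
The relationship between power, significance level, effect size, and sample size described above can be illustrated with a small calculation. The sketch below is only a rough approximation: it assumes a two-sided, two-sample z-test on a single gene with equal group sizes and a known standard deviation, and the function name `two_sample_power` and the example effect size are hypothetical choices made for illustration, not something the contributors describe.

```python
from statistics import NormalDist


def two_sample_power(effect_size: float, n_per_group: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided, two-sample z-test.

    effect_size is the standardized mean difference (difference / SD);
    n_per_group is the number of biological replicates in each group.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Expected shift of the test statistic under the alternative hypothesis.
    shift = effect_size * (n_per_group / 2) ** 0.5
    # Probability of falling in either rejection region.
    return (1 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)


if __name__ == "__main__":
    # Assumed effect size of 1.5 SD, chosen only for illustration.
    for n in (3, 5, 10, 20):
        print(n, round(two_sample_power(1.5, n), 3))
```

With the significance level and effect size held fixed, the printed power rises as the number of replicates per group grows, which is the trade-off against budget and labor noted above.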

GT: Which tools do you use to normalize your data and account for background noise? Why?

GR: After the image analysis of the arrays (which is generally performed using software provided by the manufacturer), the probe-level data undergo some pre-processing before a higher-level statistical analysis. The pre-processing generally involves data cleaning, background adjustment, normalization, summarization, and quality assessment. Each of these steps will depend on the microarray platform used (e.g., high-density oligonucleotide versus two-color spotted arrays), and should also be guided by careful data visualization and descriptive analysis. The goal of background adjustment is to correct the data for non-specific hybridization and noise in the optical detection system. However, it is always important to remember that there is a trade-off between variance and bias when performing background correction. While a background adjustment may improve the accuracy of expression measurements, it also inflates the measurement variance. Hence, the choice of background adjustment should depend on the goal of the experiment. For example, background adjustment could be essential if the goal is to estimate fold changes. However, it may be completely unnecessary if the purpose of the experiment is the detection of differentially expressed genes, which will be validated using some other technique, such as RT-qPCR. In my own research, I've been using mostly the RMA [robust multichip average] and vsn [variance stabilization and calibration for microarray data] approaches for high-density oligonucleotide array data, and LOESS [robust locally weighted regression] (generally directly on total intensities, without background subtraction) for two-color data. A detailed description of various alternative methods and their implementation in R can be found, for example, in Gentleman et al. (2005).
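
As a rough illustration of what a normalization step does, the sketch below implements plain quantile normalization, the between-array normalization that RMA applies between its background-adjustment and summarization steps. It is a simplified stand-in written in Python rather than the Bioconductor implementation mentioned above; the function name and the toy intensities are assumptions, and ties are broken by position rather than averaged.

```python
from typing import List, Sequence


def quantile_normalize(arrays: Sequence[Sequence[float]]) -> List[List[float]]:
    """Quantile-normalize arrays that each hold one intensity per probe."""
    n_probes = len(arrays[0])
    if any(len(a) != n_probes for a in arrays):
        raise ValueError("all arrays must contain the same number of probes")
    # Reference distribution: mean of the k-th smallest value across arrays.
    sorted_arrays = [sorted(a) for a in arrays]
    reference = [sum(col) / len(arrays) for col in zip(*sorted_arrays)]
    normalized = []
    for a in arrays:
        # Rank probes within the array (ties broken by position).
        order = sorted(range(n_probes), key=lambda i: a[i])
        out = [0.0] * n_probes
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]
        normalized.append(out)
    return normalized


if __name__ == "__main__":
    # Toy intensities for two arrays and three probes (made-up values).
    print(quantile_normalize([[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]))
```

After normalization every array shares the same empirical distribution, so remaining differences between arrays reflect changes in probe rank rather than overall intensity.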

GT: What steps do you take to adhere to the MIAME guidelines?

GR: As a bioinformatics/data analysis person, my main contribution on microarray experiments in terms of MIAME guidelines is a careful explanation of the experimental design, such as a description of the experimental units — biological replications, treatment factors and respective levels, randomization approach, et cetera — and the data analysis approach, such as data-processing protocols and higher level statistical analysis.

Genome Technology: How many biological and/or technical replicates do you perform on any given sample? Why?

Lisa White: I require that all investigators meet with us [Baylor College of Medicine Microarray Core facility] to discuss experimental design, user expectations, and requirements for gene-expression microarray processing. Depending on their experimental setup (i.e., time course, longitudinal study, cross-sectional study) and organism, we can generally come to a reasonable agreement on numbers of replicate samples. Almost as a rule, we do not perform technical replicates, but rather biological replicates (or a practical facsimile of biological replicates). Obviously the more replicates that can be run, the better; however, the lower cutoff can vary somewhat. Researchers using congenic/isogenic mouse models can usually get away with around three replicates, but with rat models I would recommend four or five replicates. We also get samples for pig, cow, and — of course — human. For these sample types, I recommend as many replicates as the investigator can afford, but no less than three.

GT: Which tools do you use to normalize your data and account for background noise? Why?

LW: Users in our facility can take two routes to data analysis. The first is requesting that the facility biostatistician/bioinformatician perform statistical analysis and downstream pathway analysis. In this case, we utilize the Bioconductor suite of analysis tools to normalize the arrays. We have had good success with RMA normalization for Affymetrix arrays. With spotted glass arrays (e.g., Agilent and NimbleGen), we use Lowess normalization. The second option is for investigators to analyze the data themselves. Here, depending on the comfort and programming skills of the researcher, they will either follow our lead and analyze the data using a Bioconductor package, or use commercial software. We've utilized Agilent's GeneSpring and Partek's Genome Suite software to perform some of the same analyses available in Bioconductor; these packages are more amenable to researchers with less-developed programming skills.

GT: What steps do you take to adhere to the MIAME guidelines?

LW: We've tried to incorporate the important MIAME data fields into our LIMS submission forms. Researchers are required to fill out information that follows MIAME guidelines when they enter a request for processing to our facility via our Web-based LIMS. Although initially there was some pushback, this process has been in place long enough that it's become routine for our users.

Genome Technology: How many biological and/or technical replicates do you perform on any given sample? Why?

Arnoud van Vliet: Our standard operating procedure is to do three independent (biological) replicates for each experiment, to ensure that any positive results are not due to variations in bacterial growth rate, variations in media, or the growth phase the cells are in. Technical replicates are rarely necessary, as our arrays already contain duplicate or triplicate features (each oligo is included two or three times and spread across the array to allow for intra-experiment quality control). Each gene on the array is represented by two or more different, non-overlapping oligonucleotides. This is possible because of the small size of the genomes we investigate (1.7 Mbp, about 1,700 features); this also allows us to ascribe greater confidence to the observed expression changes for individual genes.

GT: Which tools do you use to normalize your data and account for background noise? Why?

AV: We use different software programs for data acquisition and analysis. We mostly use GenePix (v6) to scan slides and initially correct features and gridding for Type I arrays (competitive hybridization). To correct for background noise, we subtract the median local background pixel intensity for each feature. We use Bluefuse software for Type II arrays (hybridization against a common standard). GeneSpring (v9 and v11) is useful for further comparative and statistical analyses. We primarily use t-tests (p less than 0.05) for statistical validation of Type I arrays along with the MArray and ArrayLeaRNA Excel macros, whereas we use a false discovery rate (FDR less than 0.05) for Type II arrays. We normally perform LOESS normalization before submitting to the Gene Expression Omnibus or ArrayExpress databases.
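
To make the false discovery rate threshold mentioned above more concrete, here is a minimal sketch of one common way to control the FDR, the Benjamini-Hochberg step-up adjustment. The answer does not say which FDR procedure the analysis software applies, so the choice of Benjamini-Hochberg is an assumption, as are the function name and the example p-values.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    # Made-up per-gene p-values, for illustration only.
    p = [0.01, 0.04, 0.03, 0.005]
    adjusted = benjamini_hochberg(p)
    significant = [i for i, q in enumerate(adjusted) if q < 0.05]
    print(adjusted, significant)
```

Genes whose adjusted value falls below 0.05 would be reported at an FDR of 5 percent under this procedure.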

GT: What steps do you take to adhere to the MIAME guidelines?

AV: For MIAME compliance, all experimental procedures must be detailed and included in any submissions. Although image files are not required, the documentation is extensive and includes a description of the arrays used, the sample preparation, labeling, and hybridization protocols, and subsequent analysis methods. Normalized results are submitted for each gene.

qPCR: Anders Ståhlberg and Chang-Qi Zhu

Genome Technology: How do you design your probes and primers to obtain the most accurate readouts?

Anders Ståhlberg: Assay design is one of the most important steps for successful RT-qPCR experiments. Careful design can reduce optimization and eliminate amplification of unwanted splice forms and other targets/sequences. For gene expression profiling, we are mainly using SYBR green-based assays, while we use probe-based assays for SNP detection, separation of almost-identical sequences, and when we plan to run multiple assays together. Sequence information is primarily obtained from the NCBI RefSeq database, but we also check for additional splice forms that occur in Ensembl. If we lack information about the importance of different splice forms, we initially try to design our assays to amplify all verified/predicted splice forms. Primer/probe design should follow general guidelines: primers should span exon-exon boundaries and/or span an intron to avoid genomic amplification. The amplification length should not be too long (less than 250bp for all detection chemistries), allowing for an efficient amplification. We mainly use Primer3/4 (including NCBI-Primer3 Blast) and Netprimer for our primer designs. The NCBI-Primer3 Blast algorithm is most useful to check the specificity of any primer design. If we have to reduce the stringency in our design, we usually skip the requirement that the primers are not allowed to amplify genomic DNA, since most samples can be treated with DNase. Furthermore, RT-negative samples will indicate if genomic DNA is a problem. Both dye- and probe-based assays are validated by gel electrophoresis. Some assays need to be re-evaluated if the sample source has changed. In some cases, the PCR product needs to be sequenced to verify that the correct target is amplified. The most important criterion in our assay design and optimization is specificity. Good assays should not generate any unspecific PCR products, including primer-dimers. Our single-cell assays are not allowed to generate any primer-dimers even after 50 cycles of amplification.
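
Two of the design guidelines above (an amplicon shorter than 250 bp, and primers that either span an exon-exon boundary or flank an intron) lend themselves to a simple automated check. The sketch below is a hypothetical helper, not part of Primer3, Primer-Blast, or Netprimer: the `Primer` class, the transcript coordinate convention, and the function names are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Primer:
    start: int  # 0-based start on the spliced transcript
    end: int    # exclusive end on the spliced transcript


def spans_junction(primer: Primer, junctions: Sequence[int]) -> bool:
    """True if an exon-exon junction falls strictly inside the primer."""
    return any(primer.start < j < primer.end for j in junctions)


def check_assay(forward: Primer, reverse: Primer, junctions: Sequence[int],
                max_amplicon: int = 250) -> List[str]:
    """Return design warnings; an empty list means neither rule was violated.

    junctions holds the transcript coordinate of the first base of each
    downstream exon (an assumed convention).
    """
    warnings = []
    amplicon_len = reverse.end - forward.start
    if amplicon_len >= max_amplicon:
        warnings.append(f"amplicon is {amplicon_len} bp (should be < {max_amplicon} bp)")
    on_junction = spans_junction(forward, junctions) or spans_junction(reverse, junctions)
    flanks_intron = any(forward.end <= j <= reverse.start for j in junctions)
    if not (on_junction or flanks_intron):
        warnings.append("primers neither span an exon-exon junction nor flank an intron")
    return warnings


if __name__ == "__main__":
    # Made-up coordinates: a 120 bp amplicon flanking a junction at position 150.
    print(check_assay(Primer(100, 120), Primer(200, 220), junctions=[150]))
```

A check like this only screens candidate designs; specificity still has to be confirmed experimentally, as described above.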

GT: What measures do you take to avoid contamination?

AS: Avoiding contamination is crucial in our experiments since we perform single cell analyses on a regular basis, and are often trying to detect only a few target molecules. We perform pre-PCR, PCR, and post-PCR in three different rooms. In the pre-PCR room, we use a UV cabinet and no RNA/DNA templates are allowed there. Sample handling and RNA/DNA extraction are also performed in separate locations to minimize contamination. We perform the actual RT-qPCR experiments in the PCR room. All post-PCR work is performed in a separate room; people who work here are not allowed to run further PCR experiments that same day, since one single PCR tube contains more than 10^10 template molecules. We also store chemicals, samples, and PCR products separately to further minimize the risk of contamination; our work surfaces are regularly decontaminated. To detect contamination, we always include negative controls for all RT-qPCR assays. If possible, we also include negative controls earlier in the workflow.

GT: What is your protocol for replicates? Why?

AS: The nature of the experiment determines the number of replicates we use. We have reported that experimental variability is mainly occurring in the RT step — or earlier, during the sample processing (sampling and purification) — with the exception of samples with very few transcripts. Most research studies address a biological or medical problem; consequently, the number of unique samples should be as high as possible. In most such studies, we run RT-qPCR as single reactions (one purification, one RT, and one qPCR). For some analyses, such as diagnostic tests, each sample must be analyzed with the highest possible accuracy. Here, replicates should be included in the sample processing as early on as possible. Usually the nature of the sample determines when it is possible to split the sample into replicates. Another factor to consider is cost: it is easiest and cheapest to run qPCR replicates, but this is the replicate that improves the end result least. For all qPCR experiments, we include positive and negative controls using two or more replicates for both.

GT: What steps do you take to adhere to the MIQE guidelines?

AS: The MIQE guidelines serve as a fundamental base when we design, validate, measure, analyze, and report RT-qPCR data. They are a valuable reference for checking the parameters of importance in your experiments. Not all of the recommendations are always applicable to our studies, but most are. The guidelines are also very useful when inexperienced people are being trained in RT-qPCR.

Genome Technology: How do you design your probes and primers to obtain the most accurate readouts?

Chang-Qi Zhu: We use ABI's PrimerExpress to design our primers. The primer sequences are Blasted against human and mouse genomic sequences to determine whether the primers are human- or mouse-specific.

GT: What is your protocol for replicates? Why?

CZ: We run each PCR in triplicate. The gene expression level is the average of the two closest replicates; we do not use data from the third replicate.
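
The replicate rule described above, averaging the two closest of three measurements, can be written down in a few lines. This is only a sketch: the answer does not say whether the rule is applied to Ct values or to derived quantities, so applying it to Ct values here is an assumption, as are the function name and the example numbers.

```python
from itertools import combinations
from typing import Sequence


def mean_of_closest_pair(values: Sequence[float]) -> float:
    """Average the two replicate values that lie closest to each other."""
    if len(values) < 2:
        raise ValueError("need at least two replicates")
    a, b = min(combinations(values, 2), key=lambda pair: abs(pair[0] - pair[1]))
    return (a + b) / 2


if __name__ == "__main__":
    # Made-up triplicate Ct values; the third replicate is the outlier.
    print(mean_of_closest_pair([24.1, 24.3, 25.6]))
```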

GT: What steps do you take to adhere to the MIQE guidelines?

CZ: We list the target gene, PCR optimization, primer sequences that were used to do the quantification, PCR machine and PCR cycling, and quantification method, whether it is done by ΔΔCt or standard curve.
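
For readers unfamiliar with the ΔΔCt method mentioned above, the following sketch shows the standard calculation. It assumes roughly 100 percent amplification efficiency for both the target and the reference gene (a doubling per cycle); when that does not hold, a standard curve or an efficiency-corrected model is the usual alternative. The function and parameter names and the Ct values are illustrative assumptions.

```python
def relative_expression_ddct(ct_target_sample: float, ct_ref_sample: float,
                             ct_target_calibrator: float, ct_ref_calibrator: float) -> float:
    """Fold change of the target gene in a sample relative to a calibrator.

    Assumes perfect doubling per cycle for both target and reference genes.
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_calibrator = ct_target_calibrator - ct_ref_calibrator
    delta_delta_ct = delta_sample - delta_calibrator
    return 2 ** (-delta_delta_ct)


if __name__ == "__main__":
    # Made-up Ct values: the target appears two cycles earlier in the sample.
    print(relative_expression_ddct(22.0, 18.0, 24.0, 18.0))
```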

RNA-seq: Jürg Bähler and Rui Chen

Genome Technology: How do you determine the level of coverage to aim for when designing your experiment?

Jürg Bähler: That's an important issue, and we are currently performing some analyses in this respect. We do not have any solid rules yet. For most analyses, we simply use one Illumina lane/fission yeast transcriptome sample, which provides around 60-fold coverage and is of sufficient depth to detect most transcripts with reasonable read numbers. For other analyses, we use SOLiD runs with 32-fold multiplexing, which provide about 20x coverage — which seems appropriate for applications that do not require the detection of low abundance transcripts.
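
A back-of-the-envelope way to relate read numbers to fold coverage like the figures quoted above is to divide the total sequenced bases by the size of the target. The sketch below does exactly that; it yields an average only, and because transcript abundance is highly uneven it is a crude guide for RNA-seq. All function names and example numbers are hypothetical and are not the laboratory's actual run parameters.

```python
import math


def fold_coverage(n_reads: int, read_length_bp: int, target_size_bp: int) -> float:
    """Average fold coverage = total sequenced bases / target size."""
    return n_reads * read_length_bp / target_size_bp


def reads_needed(coverage: float, read_length_bp: int, target_size_bp: int) -> int:
    """Smallest read count that reaches the requested average coverage."""
    return math.ceil(coverage * target_size_bp / read_length_bp)


def reads_per_sample(reads_per_run: int, multiplex_level: int) -> int:
    """Reads available to each sample when a run is split evenly."""
    return reads_per_run // multiplex_level


if __name__ == "__main__":
    # Assumed values: 50 bp reads and a 10 Mbp target, for illustration only.
    print(fold_coverage(1_000_000, 50, 10_000_000))
    print(reads_needed(20, 50, 10_000_000))
    print(reads_per_sample(64_000_000, 32))
```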

GT: What are some of the measures you take to confirm your "hits"?

JB: We compare with previous microarray and Affymetrix chip data, and we also verify selected genes by RT-PCR.

GT: Which tools do you use for your downstream analyses? Why?

JB: Currently, we use customized Perl and R scripts, as this approach is quick and … can be tailored to our specific needs. For alignment, we mostly use Exonerate, which is fast and serves us well. Somebody in the group is now analyzing and comparing six RNA-seq analysis packages, but no conclusions are yet available.

Genome Technology: How do you determine the level of coverage to aim for when designing your experiment?

Rui Chen: The appropriate level of sequencing coverage depends on the objective of the experiment. For example, if the goal is to obtain gene expression level, relatively low coverage is needed. In general, we find that fewer than 10 million reads are sufficient to give a very accurate expression profile. In contrast, if the goal is to identify splicing isoforms for genes that are expressed at a medium or relatively low level, much higher sequencing coverage and longer read lengths are required. To help determine whether sufficient coverage has been reached, it is also useful to plot out a saturation curve using different levels-of-coverage data.
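
One simple way to build the kind of saturation curve mentioned above is to subsample the mapped reads at increasing depths and count how many genes are detected at each depth. The sketch below assumes each mapped read has already been assigned to a gene; the function name, the `min_reads` detection threshold, and the toy data are assumptions for illustration rather than the pipeline used by the contributor.

```python
import random
from collections import Counter
from typing import List, Sequence, Tuple


def saturation_curve(read_genes: Sequence[str], fractions: Sequence[float],
                     min_reads: int = 1, seed: int = 0) -> List[Tuple[int, int]]:
    """Return (depth, genes detected) pairs for nested random subsamples.

    read_genes lists the gene each mapped read was assigned to.
    """
    shuffled = list(read_genes)
    random.Random(seed).shuffle(shuffled)
    curve = []
    for fraction in fractions:
        depth = int(len(shuffled) * fraction)
        # Taking prefixes of one shuffle keeps the subsamples nested.
        counts = Counter(shuffled[:depth])
        detected = sum(1 for c in counts.values() if c >= min_reads)
        curve.append((depth, detected))
    return curve


if __name__ == "__main__":
    # Toy data: one abundant, one moderate, and one rare transcript.
    reads = ["geneA"] * 50 + ["geneB"] * 5 + ["geneC"]
    print(saturation_curve(reads, [0.25, 0.5, 0.75, 1.0]))
```

When the number of detected genes stops rising as depth increases, additional sequencing is adding little for expression profiling; rare transcripts and splice isoforms typically plateau much later.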

GT: What are some of the measures you take to confirm your "hits"?

RC: In our experience, RNA-seq is quite accurate in measuring gene expression level and has good concordance with other methods, such as RT-qPCR. So for measuring gene expression level, it is only essential to validate several genes to make sure the experimental procedure is sound. I think it is more important to perform experimental replicates to confirm the finding. Technical replicates will also be useful, if desired. To confirm findings, we typically use RT-qPCR for gene expression level and junction RT-PCR and sequencing for novel splicing events.

GT: Which tools do you use for your downstream analyses? Why?

RC: For small genomes, such as Drosophila, we use an in-house pipeline based on the BLAT alignment algorithm (Blast-like Alignment Tool). We have found that 5 to 6 percent of our RNA-seq reads (75bp length) contain splice-junction events. BLAT supports intron mapping and has been essential for identifying reads that cross splicing junctions. In organisms with larger genomes, such as mouse, we are using the Bowtie, TopHat, and Cufflinks software packages. Due to the computational resources required for alignment of reads to larger genomes, software designed specifically for short-read alignment is necessary. The Cufflinks package provides executables for performing analysis of alternative splicing and differential expression. Both of these approaches calculate RPKM [reads per kilobase of exon model per million mapped reads] and identify alternative splicing events from RNA-seq reads. Downstream, we perform additional statistical analyses using in-house scripts written in the R programming language.
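
The RPKM measure defined above follows directly from its name: reads mapped to a gene, divided by the exon model length in kilobases and by the total mapped reads in millions. The sketch below shows that arithmetic in Python; the function name and example counts are illustrative, and it is not the code used inside Cufflinks or the in-house pipeline.

```python
def rpkm(read_count: int, exon_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of exon model per million mapped reads."""
    if exon_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total mapped reads must be positive")
    # 1e9 combines the per-kilobase (1e3) and per-million (1e6) scaling.
    return read_count * 1e9 / (exon_length_bp * total_mapped_reads)


if __name__ == "__main__":
    # Made-up example: 500 reads on a 2 kb exon model, 10 million mapped reads.
    print(rpkm(500, 2_000, 10_000_000))
```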

MicroRNA Expression Profiling: Pieter Mestdagh and Bin Shuai

Genome Technology: Which tools do you use for RNA extraction?

Pieter Mestdagh: The approach we take for total RNA extraction depends on the amount of available starting material. In case we want to isolate total RNA from 100 to 10,000 cells (e.g., cells grown in 96-well plates), we typically apply an in-well lysis and perform reverse transcription directly on the crude cell lysate. This significantly reduces hands-on time and maximizes RNA yield, since no column-based purification is required. When starting with higher cell numbers or tissues, we lyse the cells using a phenol/guanidine-based lysis reagent and isolate the total RNA fraction after phase separation using chloroform. The RNA is then further purified on a spin column. In any case, one should be aware that not all RNA isolation kits co-purify the small RNA fraction, so one should carefully read the product manual before deciding which kit to use.

GT: In the event of a contamination, how do you go about removing DNA — or other contaminants — from a sample?

PM: We tend to prevent contamination rather than having to remove it. First, we apply a strict separation between pre-PCR and post-PCR or DNA labs. Secondly, all PCR reactions are prepared in semi-closed cabinets that are thoroughly cleaned with UV overnight and with ethanol before use. The necessity to remove contaminating genomic DNA from the RNA sample depends on the downstream application: for miRNA expression profiling, we use the stem-loop RT-qPCR method, which does not require DNA removal.

GT: Which approach do you prefer for detecting miRNA expression, and why?

PM: qPCR is definitely the preferred method for miRNA expression analysis, in terms of speed, sensitivity, specificity, and its large dynamic range of quantification. More specifically, we use the stem-loop RT-qPCR technology in combination with TaqMan detection for miRNA expression profiling. In the event that only a limited amount of RNA is available, a limited cycle pre-amplification step is introduced before starting the actual qPCR reaction; this enables us to significantly increase the sensitivity of miRNA detection, even down to the single cell level. Importantly, the preamplification procedure is highly robust and reproducible and does not interfere with the interpretation of the results. Finally, TaqMan-based miRNA detection provides the specificity needed to distinguish miRNAs that belong to the same family and that differ by only one or two nucleotides, such as the let-7 family members. We prefer this approach because it allows us to use minimal amounts of input RNA, does not require a DNase treatment of the RNA sample, and provides reproducible results. Of note, miRNA expression profiling can only be successful if a robust detection platform is complemented with a robust normalization of the expression data. To this end, we developed a novel and powerful normalization procedure that is based on the mean expression of all expressed miRNAs per sample. This normalization strategy has been implemented in the qbasePLUS software package from Biogazelle, which we use to aid miRNA expression data analysis.
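
The normalization idea described above, using the mean expression of all expressed miRNAs in a sample as the normalizer, can be sketched as follows. This is a simplified illustration and not the qbasePLUS implementation: it works on raw Cq values for a single sample, and the detection cutoff of 35 cycles, the function name, and the example values are assumptions.

```python
from typing import Dict, Optional


def global_mean_normalize(cq_values: Dict[str, Optional[float]],
                          detection_cutoff: float = 35.0) -> Dict[str, float]:
    """Subtract the mean Cq of all expressed miRNAs from each expressed miRNA.

    miRNAs with no Cq (None) or a Cq at or above the assumed cutoff are
    treated as not expressed and are left out of the result.
    """
    expressed = {name: cq for name, cq in cq_values.items()
                 if cq is not None and cq < detection_cutoff}
    if not expressed:
        raise ValueError("no expressed miRNAs in this sample")
    mean_cq = sum(expressed.values()) / len(expressed)
    return {name: cq - mean_cq for name, cq in expressed.items()}


if __name__ == "__main__":
    # Made-up Cq values for one sample; None marks an undetected miRNA.
    sample = {"let-7a": 20.0, "miR-21": 30.0, "miR-x": 38.0, "miR-y": None}
    print(global_mean_normalize(sample))
```

Lower normalized values indicate higher expression relative to the sample's average miRNA level, which makes samples with different input amounts comparable.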

Genome Technology: Which tools do you use for RNA extraction?

Bin Shuai: We extract total RNA from fresh or frozen samples using TRIzol reagent. Depending on the downstream application, the RNA samples are further purified or used as is. For example, to prepare RNA for miRNA arrays, we use a modified protocol with a Qiagen RNeasy column. We also use Ambion's flashPAGE Fractionator to enrich small RNAs for some experiments. However, this approach requires more RNA, and fractionated samples need to be checked to ensure that they contain RNAs of the correct size.

GT: In the event of a contamination, how do you go about removing DNA — or other contaminants — from a sample?

BS: We always treat our RNA samples with DNase as a routine procedure for RNA purification. We use a pair of primers that span intron(s) of a housekeeping gene to check the quality of the RNA before moving on to the next step. For other types of contaminants, we use phenol chloroform extraction followed by ethanol precipitation. However, with good practice, RNAs extracted using TRIzol will not need to be purified with phenol again.

GT: Which approach do you prefer for detecting miRNA expression, and why?

BS: That depends. If we just want to check whether the miRNA is expressed or not, we simply use RT-PCR. It requires only a small amount of RNA, it's sensitive, and it costs less. There are kits available for RT-PCR of miRNAs, but one can also purchase reagents from different vendors and make their own primers. If we want to compare the expression of miRNAs among different samples, we use RT-qPCR. In this case, 5S rRNA is used as an endogenous control for comparative study. For RT-qPCR, we usually go with a TaqMan-based assay, since it is gene-specific and easy to optimize. For miRNA profiling, we have used miRNA arrays. However, a comparison between miRNA array and RT-qPCR has indicated that the latter is more sensitive and reproducible. The cost of running a miRNA array can be high, and its results have to be verified by RT-qPCR.

List of Resources

If you need more information, here are additional sources that may help you answer your gene expression questions.

Publications

Bähler J. (2009). Global approaches to study gene regulation. Methods. 48(3): 217.

Bengtsson M, Hemberg M, Rorsman P, Ståhlberg A. (2008). Quantification of mRNA in single cells and modeling of RT-qPCR induced noise. BMC Molecular Biology. 9: 63.

Brazma A, Hingamp P, Quackenbush J, Sherlock G, Spellman P, Stoeckert C, Aach J, Ansorge W, Ball CA, Causton HC, Gaasterland T, Glenisson P, Holstege FCP, Kim IF, Markowitz V, Matese JC, Parkinson H, Robinson A, Sarkans U, Schulze-Kremer S, Stewart J, Taylor R, Vilo J, Vingron M. (2001). Minimum information about a microarray experiment (MIAME) — toward standards for microarray data. Nature Genetics. 29: 365-371.

Bueno Filho JS, Gilmour SG, Rosa GJ. (2006). Design of microarray experiments for genetical genomics studies. Genetics. 174(2): 945-957.

Bustin SA, Benes V, Garson JA, Helleman J, Huggett J, Kubista M, Mueller R, Nolan T, Pfaffl MW, Shipley GL, Vandesompele J, Wittwer CT. (2009). The MIQE guidelines: minimum information for publication of quantitative real-time PCR experiments. Clinical Chemistry. 55: 611-622.

Chambers C, Shuai B. (2009). Profiling microRNA expression in Arabidopsis pollen using microRNA array and real-time PCR. BMC Plant Biology. 10(9): 87.

Chang KH, Mestdagh P, Vandesompele J, Kerin MJ, Miller N. (2010). MicroRNA expression profiling to identify and validate reference genes for relative quantification in colorectal cancer. BMC Cancer. 10: 173.

Churchill GA. (2002). Fundamentals of experimental design for cDNA microarrays. Nature Genetics. 32: 490-495.

Croucher NJ, Fookes MC, Perkins TT, Turner DJ, Marguerat SB, Keane T, Quail MA, He M, Assefa S, Bähler J, Kingsley RA, Parkhill J, Bentley SD, Dougan G, Thomson NR. (2009). A simple method for directional transcriptome sequencing using Illumina technology. Nucleic Acids Research. 22: e148.

Daines B, Wang H, Li Y, Han Y, Gibbs R, Chen R. (2009). High-throughput multiplex sequencing to discover copy number variants in Drosophila. Genetics. 182: 935-941.

Gentleman R, Carey VJ, Huber W, Irizarry RA and Dudoit S (Eds.). (2005). Bioinformatics and computational biology solutions using R and Bioconductor. Springer.

Jansen RC and Nap J. (2001). Genetical genomics: the added value from segregation. Trends in Genetics. 17: 388–391.

Lackner DH, Bähler J. (2008). Translational control of gene expression from transcripts to transcriptomes. International Review of Cell and Molecular Biology. 271: 199-251.

Lind K, Ståhlberg A, Zoric N, Kubista M. (2006). Combining sequence-specific probes and DNA binding dyes in real-time PCR for specific nucleic acid quantification and melting curve analysis. BioTechniques. 40(3): 315-319.

Marguerat S, Bähler J. (2009). RNA-seq: from technology to biology. Cellular and Molecular Life Sciences. 67(4): 569-579.

Mestdagh P, Feys T, Bernard N, Guenther S, Chen C, Speleman F, Vandesompele J. (2008). High-throughput stem-loop RT-qPCR miRNA expression profiling using minute amounts of input RNA. Nucleic Acids Research. 36(21): e143.

Mestdagh P, Van Vlierberghe P, De Weer A, Muth D, Westermann F, Speleman F, Vandesompele J. (2009). A novel and universal method for microRNA RT-qPCR data normalization. Genome Biology. 10: R64.

Parikh A, Miranda ER, Katoh-Kurasawa M, Fuller D, Rot G, Zagar L, Curk T, Sucgang R, Chen R, Zupan B, Loomis WF, Kuspa A, Shaulsky G. (2010). Conserved developmental transcriptomes in evolutionarily divergent species. Genome Biology. 11(3): R35.

Rosa GJM, de Leon N, Rosa AJM. (2006). A review of microarray experimental design strategies for genetical genomics studies. Physiological Genomics. 28: 15-23.

Rosa GJM, Steibel JP, Tempelman RJ. (2005). Reassessing design and analysis of two-color microarray experiments using mixed effects models. Comparative and Functional Genomics. 6: 123-131.

Srivatsan A, Han Y, Peng J, Tehranchi AK, Gibbs R, Wang JD, Chen R. (2008). High-precision, whole-genome sequencing of laboratory strains facilitates genetic studies. PLoS Genetics. 4(8): e1000139.

Ståhlberg A, Bengtsson M. (2010). Single-cell gene expression profiling using reverse transcription quantitative real-time PCR. Methods. 50(4): 282-288.

Ståhlberg A, Elbing K, Andrade-Garda JM, Sjögreen B, Forootan A, Kubista M. (2008). Multiway real-time PCR gene expression profiling in yeast Saccharomyces cerevisiae reveals altered transcriptional response of ADH-genes to glucose stimuli. BMC Genomics. 16(9): 170.

Tichopad A, Kitchen R, Riedmaier I, Becker C, Ståhlberg A, Kubista M. (2009). Design and optimization of reverse-transcription quantitative PCR experiments. Clinical Chemistry. 55: 1816-1823.

Wilhelm BT, Marguerat S, Goodhead I, Bähler J. (2010). Defining transcribed regions using RNA-seq. Nature Protocols. 5(2): 255-266.

Van Vlierberghe P, De Weer A, Mestdagh P, Feys T, De Preter K, De Paepe P, Lambein K, Vandesompele J, Van Roy N, Verhasselt B, Poppe B, Speleman F. (2009). Comparison of miRNA profiles of microdissected Hodgkin/Reed-Sternberg cells and Hodgkin cell lines versus CD77+ B-cells reveals a distinct subset of differentially expressed miRNAs. British Journal of Haematology. 147(5): 686-690.

Van den Veyver IB, Patel A, Shaw CA, Pursley AN, Kang SH, Simovich MJ, Ward PA, Darilek S, Johnson A, Neill SE, Bi W, White LD, Eng CM, Lupski JR, Cheung SW, Beaudet AL. (2009). Clinical use of array comparative genomic hybridization (aCGH) for prenatal diagnosis in 300 cases. Prenatal Diagnosis. 29(1): 29-39.

Xiao W, Liu W, Li Z, Liang D, Li L, White LD, Fox DA, Overbeek PA, Chen Q. (2006). Gene expression profiling in embryonic mouse lenses. Molecular Vision. 26(12): 1692-1698.

Yu W, Ballif BC, Kashork CD, Heilstedt HA, Howard LA, Cai WW, White LD, Liu W, Beaudet AL, Bejjani BA, Shaw CA, Shaffer LG. (2003). Development of a comparative genomic hybridization microarray and demonstration of its utility with 25 well-characterized 1p36 deletions. Human Molecular Genetics. 12(17): 2145-2152.

Chari R, Lonergan KM, Pikor LA, Coe BP, Zhu CQ, Chan TH, MacAulay CE, Tsao MS, Lam S, Ng RT, Lam WL. (2010). A sequence-based approach to identify reference genes for gene expression analysis. BMC Medical Genomics. 3(3): 32.

Websites

NCBI RefSeq
http://www.ncbi.nlm.nih.gov/refseq

NCBI Primer-Blast
http://www.ncbi.nlm.nih.gov/tools/primer-blast/

Bioconductor: Open Source Software for Bioinformatics
http://bioconductor.org/

Functional Genomics Data Society MIAME Guideline Information
http://www.mged.org/Workgroups/MIAME/miame.html

Gene Quantification MIQE Guideline Information
http://www.gene-quantification.de/miqe.html

Netprimer
http://www.premierbiosoft.com/netprimer/index.html

Exonerate
http://www.ebi.ac.uk/~guy/exonerate/