Affymetrix Quality control



Generating high-quality microarray data requires rigorous quality control at each step of the process: the experimental design of the study, the generation of samples, the extraction of RNA, the labeling of the probe, and the microarray hybridization.

To minimize experimental variability, the same dedicated research assistants should perform all microarray studies, the same Affymetrix setup should be used for hybridization, washing, and scanning, and microarrays from the same production lot should be used in comparative studies. In this tutorial, R and the affyQCReport library are used to demonstrate the quality control steps for Affymetrix chips.



RNA Quality Control

RNA is isolated using TRIzol® according to the manufacturer's protocol and purified by phenol/chloroform extraction or RNeasy® columns. The choice of protocol depends on the quantity of RNA available from the extraction. However, the same protocol is used throughout an entire experiment.
RNA purity and yield are determined by optical density (OD) measurements at wavelengths of 260 and 280 nm. The OD 260/280 ratio should be close to 2.0. Otherwise, the RNA is re-purified.
RNA quality is evaluated further using the Agilent Bioanalyzer and Lab-on-a-Chip (Figure 1). The resulting electropherograms reveal degradation (Figure 2) and measure the ribosomal 5S, 18S, and 28S bands (Figure 3). Ideally, the ratio of the 28S to the 18S band should be close to 2, but we do accept samples that show clear 18S and 28S peaks. RNA samples with a visible degree of degradation are not processed further.
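The purity criterion above can be written as a simple threshold check. The following Python sketch is only an illustration: the function name `rna_purity_ok`, the tolerance of 0.2 around the target ratio, and the sample readings are assumptions, since the text only says the ratio should be "close to 2.0".

```python
def rna_purity_ok(od260, od280, target=2.0, tolerance=0.2):
    """Return (ratio, passes) for an OD 260/280 purity check.

    `tolerance` is an illustrative assumption; the text only states
    that the ratio should be close to 2.0.
    """
    if od280 <= 0:
        raise ValueError("OD280 must be positive")
    ratio = od260 / od280
    return ratio, abs(ratio - target) <= tolerance


# Hypothetical OD readings (not real data): (OD260, OD280) per sample.
samples = {
    "sample_A": (0.50, 0.25),
    "sample_B": (0.42, 0.28),
    "sample_C": (0.38, 0.20),
}

for name, (od260, od280) in samples.items():
    ratio, ok = rna_purity_ok(od260, od280)
    decision = "proceed" if ok else "re-purify"
    print(f"{name}: 260/280 = {ratio:.2f} -> {decision}")
```

In practice the tolerance would be set by the laboratory's own acceptance criteria.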

Array Hybridization Quality Control
A general visual inspection of the entire chip, after scanning, is performed. There should be no white speckling, holes, smudges, areas of saturation or uneven hybridization on the chip.
Internal and external spiked-in controls should maintain a 1:2 ratio between the 5’ and 3’ probe sets. The internal controls are GAPDH and beta-actin. The external controls are BioB, BioC, and CreX. The signal of the external controls should also increase in the order listed, with the last being the highest.
The measure of background noise (RawQ) should remain consistent across the experiment, meaning within ± 3 points of the median.
The scaling factor (SF) should remain consistent across the experiment. Within a given experiment, the scaling factors of all arrays should fall within a 2- to 3-fold range.
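The RawQ and scaling factor criteria can be checked with a few lines of code. The Python sketch below is a minimal illustration under stated assumptions: the function names, the chip names, and all numeric values are hypothetical, and the fold range is computed as the largest scaling factor divided by the smallest.

```python
from statistics import median


def rawq_outliers(rawq, max_dev=3.0):
    """Return names of arrays whose RawQ is more than `max_dev` from the median."""
    centre = median(rawq.values())
    return [name for name, value in rawq.items() if abs(value - centre) > max_dev]


def scale_factor_spread(scale_factors):
    """Return the fold range (largest / smallest) of the scaling factors."""
    values = list(scale_factors.values())
    if min(values) <= 0:
        raise ValueError("scaling factors must be positive")
    return max(values) / min(values)


# Hypothetical values for four arrays (not real data).
rawq = {"chip1": 2.1, "chip2": 2.4, "chip3": 2.0, "chip4": 6.2}
sf = {"chip1": 0.8, "chip2": 1.1, "chip3": 1.0, "chip4": 2.9}

print("RawQ outliers:", rawq_outliers(rawq))
spread = scale_factor_spread(sf)
# The 3.0 limit is the upper end of the 2-3 fold range mentioned in the text.
print(f"Scaling factor fold range: {spread:.2f} (limit used here: 3.0)")
```

Arrays flagged by either check would be candidates for closer inspection rather than automatic rejection.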

Statistical Quality Control
Histogram and box plot analyses of *.cel intensities are performed. Histograms are a good visualization tool for identifying saturation, which appears as an additional peak at the highest log intensity in the plot. With a PMT setting of 100%, probe intensities reached saturation more frequently (see Figure 4 for arrays displaying saturation). Saturated probes are excluded from further analysis.

Box plots are another good visualization tool for examining the overall intensities of all probes across the array. The box spans the 25th to the 75th percentile of the intensity distribution. The median, or 50th percentile, is drawn inside the box. The whiskers (lines extending from the box) describe the spread of the data. Arrays must be similar in range; otherwise they are discarded (see Figure 5 for an acceptable box plot).
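The quantities behind these two plots can be computed directly. The following Python sketch is an illustration only: the function names are hypothetical, the intensities are made-up values, and the saturation ceiling of 65535 assumes a 16-bit scanner, which the text does not specify.

```python
import math
from statistics import quantiles


def box_stats(intensities):
    """Return (q1, median, q3) of the log2 intensities, i.e. the box of a box plot."""
    logged = [math.log2(x) for x in intensities if x > 0]
    q1, med, q3 = quantiles(logged, n=4, method="inclusive")
    return q1, med, q3


def saturated_fraction(intensities, ceiling=65535):
    """Return the fraction of probes at or above `ceiling`.

    `ceiling` is an assumed scanner maximum (16-bit), not a value from the text.
    """
    return sum(1 for x in intensities if x >= ceiling) / len(intensities)


# Hypothetical probe intensities for two arrays (not real data).
arrays = {
    "chip1": [64, 128, 256, 512, 1024],
    "chip2": [100, 200, 65535, 65535],
}

for name, values in arrays.items():
    q1, med, q3 = box_stats(values)
    frac = saturated_fraction(values)
    print(f"{name}: Q1={q1:.2f} median={med:.2f} Q3={q3:.2f} saturated={frac:.0%}")
```

Comparing the quartiles across arrays gives a numeric counterpart to the visual check that all arrays are similar in range.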

The Association of Biomolecular Resource Facilities (ABRF) conducted a multicenter study in 2002 to identify factors that contribute to variability in oligonucleotide microarray results. This retrospective study used data from 835 MG-U74A and HG-U95A Affymetrix arrays that had previously been generated in the microarray core facilities of the members of the Microarray Research Group (MARG) (Knudtson et al.; Factors contributing to variability in DNA microarray results: the ABRF Microarray Research Group 2002 study. J. Biomol. Tech. 2002; 108).

The results of this study indicate that:
• Lab-to-lab variation accounted for the greatest source of error in this Affymetrix study. This suggests that .CHP data generated by different institutions may not be easily compared without further normalization in comparative analyses.
• The observed variance in the signals for the exogenous control spikes suggests that the controls may not be an adequate tool for normalizing data in comparative analyses. It had previously been assumed that these values should be independent of sample and array type.
• Biological reproducibility should be demonstrated by repeating each experiment a minimum of 3 or 4 times with different extracts of the same type.
• Systematic, reproducible errors can be minimized by applying various algorithms, which serve to improve the average reproducibility from ~77% to ~93%.
However, the caveat is that one should not try to rescue a poor hybridization result with mathematical manipulations!






• GAPDH 3':5' ratios are plotted as circles. According to Affymetrix, they should be about 1. GAPDH values that are considered potential outliers (ratio > 1.25) are coloured red; otherwise they are blue.

• β-actin 3':5' ratios are plotted as triangles. Because this is a longer gene, the recommendation is for the 3':5' ratios to be below 3; values below 3 are coloured blue, those above, red.

• The blue stripe in the image represents the range where scale factors are within 3-fold of the mean for all chips. Scale factors are plotted as a line from the centre line of the image. A line to the left corresponds to a down-scaling, a line to the right to an up-scaling. If any scale factors fall outside this '3-fold region', they are all coloured red; otherwise they are blue.

• % present and average background are listed to the left of the figure. The plot is generated with the following command.
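Separately from the plotting command itself, the colouring rules in the list above amount to simple threshold checks. The Python sketch below illustrates them and is not the code used by the R package: the function names and input values are hypothetical, a ratio exactly at the threshold is treated as blue, and "within 3-fold of the mean" is read here as the interval from mean/3 to mean×3, which is an assumption.

```python
from statistics import mean


def ratio_colour(ratio, threshold):
    """Blue if the 3':5' ratio is at or below `threshold`, otherwise red."""
    return "blue" if ratio <= threshold else "red"


def scale_factor_colour(scale_factors, fold=3.0):
    """Return one colour for all chips: red if any scale factor is out of range.

    Assumption: the accepted region is mean/fold .. mean*fold.
    """
    centre = mean(scale_factors)
    inside = all(centre / fold <= sf <= centre * fold for sf in scale_factors)
    return "blue" if inside else "red"


# Hypothetical per-chip values (not real data).
gapdh = {"chip1": 1.02, "chip2": 1.31}
actin = {"chip1": 1.80, "chip2": 3.40}
scale_factors = [0.9, 1.2, 1.0]

for chip in gapdh:
    print(chip,
          "GAPDH:", ratio_colour(gapdh[chip], 1.25),
          "beta-actin:", ratio_colour(actin[chip], 3.0))
print("scale factors:", scale_factor_colour(scale_factors))
```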






This tutorial is based on the affyQCReport manual by Craig Parman and Conrad Halling. The affyQCReport package is part of the R Bioconductor project.
