The Reproducibility Probability Score (RPS) for a gene is defined as the probability that this gene is selected as being differentially expressed from the data generated by a typical laboratory. A typical laboratory is either the original laboratory that generated the microarray data or a hypothetical laboratory that prudently follows the same protocol to study the same biological materials as the original laboratory.
The RPS program computes the RPS by simulating data for the hypothetical laboratories. Two data sets are used in the simulation. The first is the data generated by the original laboratory, also referred to as the new data. The second is a reference data set. Currently, the RPS program uses the data from Shi et al. (1) as the reference data set, and it provides an option to use other reference data. The reference data and the new data must be generated on the same microarray platform, and the reference data must be generated by more than one laboratory. However, the reference data do not have to be generated by the same laboratory as the new data or originate from the same biological samples.
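The simulation idea can be illustrated with a minimal sketch. This is not the RPS program's actual algorithm, which is not specified here. It assumes that cross-laboratory variability is summarized as a per-gene standard deviation of log2 fold changes in the reference data, that hypothetical laboratories differ from the original one by Gaussian noise of that size, and that a gene is "selected" when its absolute log2 fold change meets a threshold. The function names `cross_lab_sd` and `estimate_rps`, and all parameters, are hypothetical.

```python
import random
import statistics


def cross_lab_sd(reference):
    """Per-gene standard deviation across laboratories (assumed noise model).

    reference: list of per-laboratory lists of log2 fold changes,
               one inner list per laboratory, genes in the same order.
    """
    if len(reference) < 2:
        # The reference data must come from more than one laboratory.
        raise ValueError("reference data must include at least two laboratories")
    return [statistics.stdev(gene_values) for gene_values in zip(*reference)]


def estimate_rps(new_log_ratios, lab_sd, n_labs=1000, fold_threshold=1.0, seed=0):
    """Fraction of simulated laboratories in which each gene is selected.

    new_log_ratios: log2 fold changes observed by the original laboratory.
    lab_sd:         per-gene cross-laboratory standard deviations.
    The selection rule (|log2 fold change| >= fold_threshold) is an
    illustrative assumption, not the rule used by the RPS program.
    """
    rng = random.Random(seed)
    counts = [0] * len(new_log_ratios)
    for _ in range(n_labs):
        # One pass of the outer loop corresponds to one hypothetical laboratory.
        for g, (value, sd) in enumerate(zip(new_log_ratios, lab_sd)):
            simulated = rng.gauss(value, sd)
            if abs(simulated) >= fold_threshold:
                counts[g] += 1
    return [count / n_labs for count in counts]


# Toy reference data: three laboratories, three genes (made-up numbers).
reference = [
    [2.0, 0.1, 1.0],
    [2.2, -0.1, 1.4],
    [1.8, 0.0, 0.6],
]
sd = cross_lab_sd(reference)
rps = estimate_rps([2.0, 0.0, 1.0], sd, n_labs=2000)
```

Under these assumptions, a gene whose observed fold change lies far above the threshold relative to its cross-laboratory noise receives a score near 1, a gene near zero receives a score near 0, and a gene sitting on the threshold receives a score near 0.5.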
Currently, the RPS program can analyze data from five commonly used microarray platforms: the Human Genome Survey Microarray v2.0 (Applied Biosystems, Foster City, CA, USA), the HG-U133 Plus 2.0 GeneChip (Affymetrix, Santa Clara, CA, USA), the Whole Human Genome Oligo Microarray G4112A (Agilent, Palo Alto, CA, USA), the CodeLink Human Whole Genome (GE Healthcare, Chalfont St. Giles, UK) and the Human-6 BeadChip 48K v1.0 (Illumina, San Diego, CA, USA). It can be extended to analyze other microarray platforms.
1. Shi L, et al. Nat Biotechnol. 2006;24(9):1151-61
2. Lin G, He X, Ji H, Shi L, Davis RW, Zhong S. "Reproducibility Probability Score--incorporating measurement variability across laboratories for gene selection." Nat Biotechnol. 2006 Dec;24(12):1476-7. PMID: 17160039