Assessing probe-specific dye and slide biases in two-color microarray data

Ruixiao Lu, Geun Cheol Lee, Michael Shultz, Chris Dardick, Kihong Jung, Jirapa Phetsom, Yi Jia, Robert H. Rice, Zelanna Goldberg, Patrick S. Schnable, Pamela Ronald, David M. Rocke

Research output: Contribution to journal › Article

9 Citations (Scopus)

Abstract

Background: A primary reason for using two-color microarrays is that using two samples labeled with different dyes on the same slide, which bind to probes on the same spot, is supposed to adjust for many factors that introduce noise and errors into the analysis. Most users assume that any differences between the dyes can be adjusted out by standard methods of normalization, so that measures such as log ratios on the same slide are reliable measures of comparative expression. However, even after normalization, probe-specific dye and slide variation remains in the data. We define a method to quantify the amount of dye-by-probe and slide-by-probe interaction. This serves as a diagnostic, both visual and numeric, of the existence of probe-specific dye bias. We show how this improved the performance of two-color arrays used for genomic analysis of biological samples ranging from rice to human tissue.

Results: We develop a procedure for quantifying the extent of probe-specific dye and slide bias in two-color microarrays. The primary output is a graphical diagnostic of the extent of the bias, called the ECDF (Empirical Cumulative Distribution Function), though numerical results are also obtained.

Conclusion: We show that the dye and slide biases were high for human and rice genomic arrays in two gene expression facilities, even after standard intensity-based normalization, and describe how this diagnostic allowed the problems causing the probe-specific bias to be addressed, resulting in important improvements in performance. The R package LMGene, which contains the method described in this paper, is available for download from Bioconductor.

Original language: English (US)
Article number: 314
Journal: BMC Bioinformatics
Volume: 9
DOIs: https://doi.org/10.1186/1471-2105-9-314
State: Published - Jul 19 2008

Fingerprint

Microarrays
Dyes
Microarray Data
Probe
Coloring Agents
Color
Normalization
Diagnostics
Microarray
Genomics
Cumulative distribution function
Numerics
Gene expression
Gene Expression
Distribution functions
Noise
Quantify
Tissue
Numerical Results
Output

ASJC Scopus subject areas

  • Medicine(all)
  • Structural Biology
  • Applied Mathematics

Cite this

Assessing probe-specific dye and slide biases in two-color microarray data. / Lu, Ruixiao; Lee, Geun Cheol; Shultz, Michael; Dardick, Chris; Jung, Kihong; Phetsom, Jirapa; Jia, Yi; Rice, Robert H.; Goldberg, Zelanna; Schnable, Patrick S.; Ronald, Pamela; Rocke, David M.

In: BMC Bioinformatics, Vol. 9, 314, 19.07.2008.

Lu, R, Lee, GC, Shultz, M, Dardick, C, Jung, K, Phetsom, J, Jia, Y, Rice, RH, Goldberg, Z, Schnable, PS, Ronald, P & Rocke, DM 2008, 'Assessing probe-specific dye and slide biases in two-color microarray data', BMC Bioinformatics, vol. 9, 314. https://doi.org/10.1186/1471-2105-9-314
Lu, Ruixiao ; Lee, Geun Cheol ; Shultz, Michael ; Dardick, Chris ; Jung, Kihong ; Phetsom, Jirapa ; Jia, Yi ; Rice, Robert H. ; Goldberg, Zelanna ; Schnable, Patrick S. ; Ronald, Pamela ; Rocke, David M. / Assessing probe-specific dye and slide biases in two-color microarray data. In: BMC Bioinformatics. 2008 ; Vol. 9.
@article{a0692fcc5a89477bac9187d0d2b12c86,
title = "Assessing probe-specific dye and slide biases in two-color microarray data",
abstract = "Background: A primary reason for using two-color microarrays is that using two samples labeled with different dyes on the same slide, which bind to probes on the same spot, is supposed to adjust for many factors that introduce noise and errors into the analysis. Most users assume that any differences between the dyes can be adjusted out by standard methods of normalization, so that measures such as log ratios on the same slide are reliable measures of comparative expression. However, even after normalization, probe-specific dye and slide variation remains in the data. We define a method to quantify the amount of dye-by-probe and slide-by-probe interaction. This serves as a diagnostic, both visual and numeric, of the existence of probe-specific dye bias. We show how this improved the performance of two-color arrays used for genomic analysis of biological samples ranging from rice to human tissue. Results: We develop a procedure for quantifying the extent of probe-specific dye and slide bias in two-color microarrays. The primary output is a graphical diagnostic of the extent of the bias, called the ECDF (Empirical Cumulative Distribution Function), though numerical results are also obtained. Conclusion: We show that the dye and slide biases were high for human and rice genomic arrays in two gene expression facilities, even after standard intensity-based normalization, and describe how this diagnostic allowed the problems causing the probe-specific bias to be addressed, resulting in important improvements in performance. The R package LMGene, which contains the method described in this paper, is available for download from Bioconductor.",
author = "Ruixiao Lu and Lee, {Geun Cheol} and Michael Shultz and Chris Dardick and Kihong Jung and Jirapa Phetsom and Yi Jia and Rice, {Robert H.} and Zelanna Goldberg and Schnable, {Patrick S.} and Pamela Ronald and Rocke, {David M.}",
year = "2008",
month = "7",
day = "19",
doi = "10.1186/1471-2105-9-314",
language = "English (US)",
volume = "9",
journal = "BMC Bioinformatics",
issn = "1471-2105",
publisher = "BioMed Central",

}

TY - JOUR

T1 - Assessing probe-specific dye and slide biases in two-color microarray data

AU - Lu, Ruixiao

AU - Lee, Geun Cheol

AU - Shultz, Michael

AU - Dardick, Chris

AU - Jung, Kihong

AU - Phetsom, Jirapa

AU - Jia, Yi

AU - Rice, Robert H.

AU - Goldberg, Zelanna

AU - Schnable, Patrick S.

AU - Ronald, Pamela

AU - Rocke, David M.

PY - 2008/7/19

Y1 - 2008/7/19

AB - Background: A primary reason for using two-color microarrays is that using two samples labeled with different dyes on the same slide, which bind to probes on the same spot, is supposed to adjust for many factors that introduce noise and errors into the analysis. Most users assume that any differences between the dyes can be adjusted out by standard methods of normalization, so that measures such as log ratios on the same slide are reliable measures of comparative expression. However, even after normalization, probe-specific dye and slide variation remains in the data. We define a method to quantify the amount of dye-by-probe and slide-by-probe interaction. This serves as a diagnostic, both visual and numeric, of the existence of probe-specific dye bias. We show how this improved the performance of two-color arrays used for genomic analysis of biological samples ranging from rice to human tissue. Results: We develop a procedure for quantifying the extent of probe-specific dye and slide bias in two-color microarrays. The primary output is a graphical diagnostic of the extent of the bias, called the ECDF (Empirical Cumulative Distribution Function), though numerical results are also obtained. Conclusion: We show that the dye and slide biases were high for human and rice genomic arrays in two gene expression facilities, even after standard intensity-based normalization, and describe how this diagnostic allowed the problems causing the probe-specific bias to be addressed, resulting in important improvements in performance. The R package LMGene, which contains the method described in this paper, is available for download from Bioconductor.

UR - http://www.scopus.com/inward/record.url?scp=48849093135&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=48849093135&partnerID=8YFLogxK

U2 - 10.1186/1471-2105-9-314

DO - 10.1186/1471-2105-9-314

M3 - Article

VL - 9

JO - BMC Bioinformatics

JF - BMC Bioinformatics

SN - 1471-2105

M1 - 314

ER -