In this example, we examine the distribution patterns of the Drosophila Aging Experimental Data of Jin et al. (2001) before and after normalization to evaluate the efficacy of different normalization methods.
The raw data from this experiment were evaluated using distribution analysis. The overlay plot (shown below) shows the raw univariate distributions of all 48 channels from the 24 arrays.
Visually, the variation among the estimated distributions seen in the raw data is somewhat reduced in both the ANOVA normalized and the Mixed Model normalized data. However, the reduction in variability is modest, so other methods for normalizing the input data should be considered.
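A visual judgment such as "somewhat reduced" can be complemented by a rough numeric summary. The following is a minimal sketch, not part of the original analysis: the function name `between_channel_spread`, the choice of quantile grid, and the simulated data are all assumptions made for illustration, and per-channel mean centering is used only as a simple stand-in for a normalization step.

```python
import numpy as np

def between_channel_spread(x, probs=None):
    """Average, over a grid of quantiles, of the standard deviation of
    that quantile across channels (rows = spots, columns = channels).
    Hypothetical summary statistic; smaller means more similar distributions."""
    if probs is None:
        probs = np.linspace(0.05, 0.95, 19)
    x = np.asarray(x, dtype=float)
    q = np.quantile(x, probs, axis=0)   # shape: (len(probs), n_channels)
    return float(q.std(axis=1).mean())

# Simulated log-intensities for 48 channels (illustrative data only).
rng = np.random.default_rng(0)
locs = rng.uniform(8.0, 10.0, size=48)      # per-channel location shifts
scales = rng.uniform(0.8, 1.2, size=48)     # per-channel scale differences
raw = rng.normal(loc=locs, scale=scales, size=(1000, 48))

# Stand-in normalization: subtract each channel's mean.
centered = raw - raw.mean(axis=0)

spread_raw = between_channel_spread(raw)
spread_centered = between_channel_spread(centered)
```

In this sketch, centering removes the location differences but leaves the scale differences, so the spread drops without reaching zero, which mirrors the partial reduction described above.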
Control Set normalization (below) appears to have increased variability between experiments and would not be appropriate for these data.
Analysis of the Quantile normalized data (below) showed that the variation among the estimated distributions seen in the raw data has been completely eliminated. Unfortunately, the extreme uniformity of the distributions suggests that the intensity data have probably been over-corrected. Such over-correction might cause you to miss significant differences, so proceed with caution whenever you see such uniform distributions.
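To see why quantile normalization removes all variation among the distributions, consider the textbook form of the technique: each channel's k-th smallest value is replaced by the mean of the k-th smallest values across all channels. The sketch below assumes that form; it is not necessarily the exact procedure used by the software described here, the function name `quantile_normalize` and the simulated data are illustrative, and ties are not given any special treatment.

```python
import numpy as np

def quantile_normalize(x):
    """Textbook quantile normalization of a 2-D array
    (rows = spots, columns = channels). Ties are not averaged."""
    x = np.asarray(x, dtype=float)
    order = np.argsort(x, axis=0)                    # row indices that sort each column
    sorted_x = np.take_along_axis(x, order, axis=0)  # each column sorted ascending
    reference = sorted_x.mean(axis=1)                # mean of the k-th smallest values
    out = np.empty_like(x)
    # Put the reference value for rank k where each column's k-th smallest value was.
    np.put_along_axis(out, order, reference[:, None], axis=0)
    return out

# Simulated log-intensities for 48 channels (illustrative data only).
rng = np.random.default_rng(1)
raw = rng.normal(loc=rng.uniform(8.0, 10.0, size=48),
                 scale=rng.uniform(0.5, 1.5, size=48),
                 size=(1000, 48))

normalized = quantile_normalize(raw)
sorted_norm = np.sort(normalized, axis=0)
```

After this step every column contains the same set of values, only in a different order, so the per-channel distributions coincide exactly. That is the extreme uniformity noted above: any genuine channel-level difference in distribution shape is removed along with the technical variation.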
Distribution analyses (below) showed that the Loess, Factor Analysis, and Partial Least Square normalization processes were more effective than the other methods at reducing the variation among the estimated distributions seen in the raw data, yet were not so aggressive as to preclude a reasonable chance of observing significant differences. One or more of these methods should be used in the analysis of these data.