Process Description

Upper Quartile Scaling

The Upper Quartile Scaling normalization process applies a scaling factor based on upper quartile to scale each column. The within column upper quartile is calculated by excluding the rows of all 0 (or missing) values. Each upper quartile is further standardized by dividing by the geometric mean among all upper quartiles across columns to be the upper quartile scaling factors.

For additional details, please refer to Bullard et al. (BMC Bioinformatics 2010).

What do I need?

Two data sets are required by this process:

•

A SAS Input Data Set. The sam_mus_gse18905_chr1_6s.sas7bdat data set serves as an example, and is shown below. This is a tall data set, with 12 columns and 3,421 rows.

•

An Experimental Design Data Set (EDDS). This data set tells how the experiment was performed, providing information about the columns in the input data set. Note that one column in the EDDS must be named ColumnName and the values contained in this column must exactly match the column names in the input data set. The edf_mus_gse18905_chr1_6s.sas7bdat data set serves as an example, and is shown below.

The sam_mus_gse18905_ch1_6s.sas7bdat and edf_mus_gse18905_chr1_6s.sas7bdat data sets were downloaded from GEO.

For detailed information about the files and data sets used or created by JMP Genomics software, see Files and Data Sets.

Output/Results

Refer to the Upper Quartile Scaling output documentation for detailed descriptions of the output of this process.