Process Description

ANOVA

The ANOVA process fits a linear model to sequential rows of observations of a tall data set. The data are assumed to come from a pre-normalized response variable, and the values are typically log2-transformed intensities. The process fits an ANOVA model to the data from each row (or group of rows) and creates numerous output displays. You can separately test hypotheses on the effects of each variable and of their interactions.
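To illustrate what fitting a model row by row means, the following Python sketch computes a one-way ANOVA F statistic for each row of a small tall data set. It is not the implementation used by the process, which supports more general models. The gene names, group labels, and the helper one_way_anova_f are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def one_way_anova_f(values, groups):
    """Return (F, df_between, df_within) for one row of observations.

    values: the measurements in one row (for example, log2 intensities).
    groups: the group label for each measurement, in the same order.
    Assumes at least two groups and nonzero within-group variation;
    otherwise the division below raises ZeroDivisionError.
    """
    by_group = defaultdict(list)
    for value, group in zip(values, groups):
        by_group[group].append(value)

    n = len(values)
    k = len(by_group)
    grand_mean = sum(values) / n

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(vs) * (sum(vs) / len(vs) - grand_mean) ** 2
        for vs in by_group.values()
    )
    # Variation of the observations around their own group mean.
    ss_within = sum(
        (v - sum(vs) / len(vs)) ** 2
        for vs in by_group.values()
        for v in vs
    )

    df_between = k - 1
    df_within = n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical design: one group label per column of the tall data set.
groups = ["control", "control", "control", "treated", "treated", "treated"]

# Hypothetical tall data: one row per gene, one value per column.
rows = {
    "gene_1": [8.0, 8.2, 7.8, 10.0, 10.2, 9.8],
    "gene_2": [5.0, 5.5, 4.5, 5.1, 4.9, 5.0],
}

# One model fit per row, as the process does for each row of the input.
results = {gene: one_way_anova_f(vals, groups) for gene, vals in rows.items()}
for gene, (f_stat, df_b, df_w) in results.items():
    print(f"{gene}: F({df_b}, {df_w}) = {f_stat:.3f}")
```

Because a separate model is fit for every row, the cost grows with the number of rows, which is why large data sets can take a long time to process.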

Caution: This process can be computationally intensive for large data sets.

What do I need?

Two data sets are required for this process.

The first, the Input Data Set, contains all of the numeric data to be analyzed. In most cases, normalized data are recommended: that is, data from which global effects, such as dye effects and chip-to-chip variation, have been removed. This data set must be a tall data set.

The second data set is the Experimental Design Data Set (EDDS). This required data set describes how the experiment was performed by providing information about the columns in the input data set. Note that one column in the EDDS must be named ColumnName, and the values in this column must exactly match the column names in the input data set. Two other columns in this data set, Array and Experiment, correspond to an index variable and the one-way experimental variable, respectively.
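The matching requirement can be checked before running the process. The following Python sketch is only an illustration of such a check and is not part of JMP Genomics. The helper unmatched_column_names and the sample column names are hypothetical, and the comparison assumes that "exactly match" is sensitive to both case and surrounding spaces.

```python
def unmatched_column_names(input_columns, design_rows):
    """Return the EDDS ColumnName values that are not input column names.

    input_columns: the column names of the input data set.
    design_rows: the EDDS rows, each a dict with a "ColumnName" key.
    The comparison is exact (case and whitespace both matter), which is
    an assumption about how strictly the names must match.
    """
    known = set(input_columns)
    return [
        row["ColumnName"]
        for row in design_rows
        if row["ColumnName"] not in known
    ]


# Hypothetical input data set columns: an identifier plus four arrays.
input_columns = ["GeneID", "array1", "array2", "array3", "array4"]

# Hypothetical EDDS with the three columns named in the text above.
design_rows = [
    {"ColumnName": "array1", "Array": 1, "Experiment": "control"},
    {"ColumnName": "array2", "Array": 2, "Experiment": "control"},
    {"ColumnName": "Array3", "Array": 3, "Experiment": "treated"},   # wrong case
    {"ColumnName": "array4 ", "Array": 4, "Experiment": "treated"},  # trailing space
]

unmatched = unmatched_column_names(input_columns, design_rows)
for name in unmatched:
    print(f"No input column named {name!r}")
```

Input columns that are absent from the EDDS, such as the identifier column in this example, are deliberately not reported, because the EDDS only needs to describe the columns that hold measurements.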

An Annotation Data Set can also be specified. This data set contains information, such as gene identity, accession numbers, and chromosomal location, for each of the rows in the input data set. This data set is also in the tall format, in which each row corresponds to a different gene.

For detailed information about the files and data sets used or created by JMP Genomics software, see Files and Data Sets.

Output/Results

The output generated by this process is summarized in a Tabbed report. Refer to the ANOVA output documentation for detailed descriptions and guides to interpreting your results.