Traditionally, stepwise regression has not addressed models that contain categorical effects. Note the following:
• When a regression model contains nominal or ordinal effects, those effects are represented by sets of indicator columns.
• When a categorical effect has only two levels, that effect is represented by a single column.
• When a categorical effect has k levels, where k > 2, it must be represented by k - 1 columns.
The convention in JMP for standard platforms is to represent nominal variables by terms whose parameter estimates average to zero across all the levels.
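As an illustration of a coding whose estimates average to zero, the following Python sketch builds k - 1 columns for a k-level nominal variable. It is not JMP code. The function name `effect_code` and the choice of the last sorted level as the one coded -1 are assumptions made for this example.

```python
import numpy as np

def effect_code(values):
    """Sum-to-zero coding (illustrative): k levels -> k - 1 columns.

    Column j is 1 for rows at level j, -1 for rows at the last level,
    and 0 otherwise. The last level's implied estimate is then the
    negative sum of the others, so the estimates average to zero.
    """
    levels = sorted(set(values))          # assumed ordering: sorted labels
    k = len(levels)
    X = np.zeros((len(values), k - 1))
    for i, v in enumerate(values):
        idx = levels.index(v)
        if idx == k - 1:
            X[i, :] = -1.0                # last level: -1 in every column
        else:
            X[i, idx] = 1.0               # other levels: own column only
    return levels, X

levels, X = effect_code(["A", "B", "C", "A", "C"])
```

Here a three-level variable yields two columns: rows at level A are coded (1, 0), rows at level B are coded (0, 1), and rows at level C are coded (-1, -1).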
In the Stepwise platform, categorical variables (nominal and ordinal) are coded in a hierarchical fashion. This differs from coding in other least squares fitting platforms. In hierarchical coding, the levels of the categorical variable are successively split into groups of levels that most separate the means of the response. The splitting process achieves the goal of representing a k-level categorical variable by k - 1 terms.
Note: In hierarchical coding, the initial terms that are constructed represent the groups responsible for the greatest separation. The advantage of this coding scheme is that these informative terms have the potential to enter the model early.
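The splitting idea can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the Stepwise platform's implementation: the function name `hierarchical_terms`, the use of between-group sum of squares as the separation measure, the ordering of levels by mean response before splitting, and the 1 / -1 / 0 column coding are all choices made for this example.

```python
import numpy as np

def hierarchical_terms(levels, y):
    """Recursively split levels into two groups that most separate
    the response means. Returns a list of (left, right, column)."""
    levels = np.asarray(levels)
    y = np.asarray(y, dtype=float)
    terms = []

    def split(group):
        if len(group) < 2:
            return                                    # single level: nothing to split
        # Order the levels in this group by their mean response.
        ordered = sorted(group, key=lambda g: y[levels == g].mean())
        best = None
        for cut in range(1, len(ordered)):
            left, right = ordered[:cut], ordered[cut:]
            yl = y[np.isin(levels, left)]
            yr = y[np.isin(levels, right)]
            # Between-group sum of squares for this two-group split.
            ss = len(yl) * len(yr) / (len(yl) + len(yr)) * (yl.mean() - yr.mean()) ** 2
            if best is None or ss > best[0]:
                best = (ss, left, right)
        _, left, right = best
        # One term per split: 1 for the left group, -1 for the right, 0 elsewhere.
        col = np.where(np.isin(levels, left), 1.0,
                       np.where(np.isin(levels, right), -1.0, 0.0))
        terms.append((left, right, col))
        split(left)
        split(right)

    split(list(dict.fromkeys(levels.tolist())))       # unique levels, first-seen order
    return terms

# Hypothetical data: four levels, two observations each.
lv = ["a", "a", "b", "b", "c", "c", "d", "d"]
resp = [1.0, 1.2, 1.1, 0.9, 5.0, 5.2, 9.0, 9.1]
terms = hierarchical_terms(lv, resp)
```

With this made-up data, the first term contrasts levels a and b against levels c and d, which is the split with the largest separation in means. Two further terms split each pair, giving k - 1 = 3 terms for the four-level variable.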