Use this field to specify a logical expression for subsetting the predictor class variables. This is useful when you want to apply a statistical filter to reduce the initial number of candidate predictor variables passed to subsequent stages of predictive model building.
Class variables are coded using 0 and 1 values, so filtering criteria are usually expressed in terms of proportions. The expression must be a valid SAS WHERE clause, and it is applied separately to each predictor class variable.
For example, to keep only diseased individuals from a data set containing a mixed population of diseased (sick) and healthy (healthy) individuals, as indicated in a column named DiseaseStatus, you could use the following simple WHERE expression:
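A minimal sketch of such an expression, assuming DiseaseStatus is a character column that stores the literal value sick for diseased individuals (adjust the case and spelling to match your data):

```sas
/* Assumption: DiseaseStatus is a character column holding "sick" or "healthy". */
DiseaseStatus = "sick"
```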
Note: The word where has already been entered for you.
For example, specifying MEAN > 7 and VAR > 3 keeps only those predictor variables whose mean value across observations in the input data set is greater than 7 and whose variance is greater than 3.
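Because class variables are coded 0 and 1, the mean of a class variable equals the proportion of observations coded 1, so a proportion-based filter can be written with MEAN alone. As a sketch, with cutoffs of 0.05 and 0.95 chosen purely for illustration, the following expression would keep only class variables whose 1-coded level occurs in more than 5% and fewer than 95% of observations:

```sas
/* Illustrative cutoffs only; choose thresholds that suit your data. */
MEAN > 0.05 and MEAN < 0.95
```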
Refer to Definitions of Functions and CALL Routines for details about the aforementioned statistics.