Use this field to specify a logical expression for subsetting the predictor class variables. This is useful when you want to apply a statistical filter to reduce the initial number of candidate predictor variables passed to subsequent stages of predictive model building.
Class variables are coded using 0 and 1 values, so filtering criteria are usually in terms of proportions. The expression must be a valid SAS WHERE clause, and it is applied separately to each predictor class variable.
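The link between 0/1 coding and proportions can be sketched outside the software: the mean of a 0/1-coded variable equals the fraction of observations coded 1, so a bound on the mean is a bound on a proportion. The Python sketch below is illustrative only and is not how the field is evaluated internally. The column names, the data, and the 0.2 and 0.8 thresholds are hypothetical.

```python
# Hypothetical 0/1-coded class variables, one list of observations per predictor.
columns = {
    "GenotypeA": [1, 0, 0, 1, 1, 0, 1, 0],  # 4 of 8 observations coded 1
    "GenotypeB": [0, 0, 0, 0, 0, 0, 0, 1],  # 1 of 8 observations coded 1
    "GenotypeC": [1, 1, 1, 1, 1, 1, 1, 1],  # 8 of 8 observations coded 1
}

def proportion_of_ones(values):
    """Mean of a 0/1-coded variable, which is the proportion of ones."""
    return sum(values) / len(values)

def keep_by_proportion(cols, low, high):
    """Apply the same criterion to each variable separately and keep the names that pass."""
    return [name for name, values in cols.items()
            if low < proportion_of_ones(values) < high]

# Drop variables that are almost always 0 or almost always 1 (assumed thresholds).
kept = keep_by_proportion(columns, 0.2, 0.8)
print(kept)
```

In this sketch only GenotypeA is kept, because its proportion of ones (0.5) lies between the two assumed bounds.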
For example, to filter only diseased individuals from a data set containing a mixed population of diseased (sick) and healthy (healthy) individuals (as indicated in a column named DiseaseStatus), you could use a simple WHERE expression that tests whether DiseaseStatus equals sick.
Note: the word where has already been entered for you.
For example, specifying MEAN > 7 and VAR > 3 keeps only those predictor variables whose mean value across observations in the input data set is greater than 7 and whose variance is greater than 3.
Refer to Definitions of Functions and CALL Routines for details about the aforementioned statistics.
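As a rough illustration of what the expression MEAN > 7 and VAR > 3 selects, the following Python sketch computes the mean and variance of each predictor across observations and keeps only the variables that satisfy both conditions. It is a sketch under stated assumptions, not the software's implementation: the predictor names and values are hypothetical, and the variance uses the n - 1 divisor, which matches the SAS default.

```python
from statistics import mean, variance

# Hypothetical predictors; each list holds one variable's values across observations.
predictors = {
    "x1": [8.0, 12.0, 10.0, 6.0],  # high mean, high variance
    "x2": [9.0, 9.5, 9.0, 9.5],    # high mean, low variance
    "x3": [1.0, 5.0, 9.0, 3.0],    # low mean, high variance
}

def passes_filter(values, min_mean=7.0, min_var=3.0):
    """Mimic MEAN > 7 and VAR > 3 for a single variable.

    statistics.variance is the sample variance (divisor n - 1).
    """
    return mean(values) > min_mean and variance(values) > min_var

# The criterion is evaluated separately for each predictor variable.
kept = [name for name, values in predictors.items() if passes_filter(values)]
print(kept)
```

Here x2 fails the variance condition and x3 fails the mean condition, so only x1 remains.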