Validation is the process of using part of a data set to estimate model parameters and using another part to assess the predictive ability of a model. With complex data, this can reduce the risk of model overfitting.
One use for a validation column is to partition the data into two or three parts.
• The training set is used to estimate the model parameters.
• The validation set is used to help choose a model with good predictive ability.
• The testing set checks the model’s predictive ability after a model has been chosen.
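As a rough illustration of the three-way partition described above, the following Python sketch assigns each row a random label. It is not JMP's implementation: the function name `make_validation_column`, the 60/20/20 proportions, and the seed are all assumptions chosen for the example.

```python
import random

def make_validation_column(n_rows, train=0.6, validation=0.2, seed=0):
    """Return one label per row: 'Training', 'Validation', or 'Test'.

    Hypothetical helper for illustration; the proportions are assumptions.
    """
    rng = random.Random(seed)
    n_train = round(n_rows * train)
    n_valid = round(n_rows * validation)
    # Whatever is left after training and validation goes to the test set.
    labels = (["Training"] * n_train
              + ["Validation"] * n_valid
              + ["Test"] * (n_rows - n_train - n_valid))
    # Shuffle so that set membership does not depend on row order.
    rng.shuffle(labels)
    return labels

column = make_validation_column(10)
```

Each row's label then determines whether it is used for estimating parameters, choosing a model, or the final check of predictive ability.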
Another use for a validation column is to partition the data into four or more folds to use in K-Fold crossvalidation.
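A fold column for K-Fold crossvalidation can be sketched in the same spirit. The Python below is an illustrative assumption, not JMP's algorithm: `make_fold_column` and `kfold_splits` are hypothetical names, and folds are numbered from 1 for readability.

```python
import random

def make_fold_column(n_rows, k=5, seed=0):
    """Assign each row a fold number from 1 to k, with near-equal fold sizes."""
    rng = random.Random(seed)
    folds = [(i % k) + 1 for i in range(n_rows)]
    rng.shuffle(folds)
    return folds

def kfold_splits(folds, k):
    """Yield (training_rows, validation_rows) index lists, one pair per fold."""
    for held_out in range(1, k + 1):
        training_rows = [i for i, f in enumerate(folds) if f != held_out]
        validation_rows = [i for i, f in enumerate(folds) if f == held_out]
        yield training_rows, validation_rows

folds = make_fold_column(10, k=5)
splits = list(kfold_splits(folds, k=5))
```

Each fold is held out once as the validation set while the model is fit to the remaining folds, so every row contributes to both fitting and assessment.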
A validation column can be used as a validation method in many JMP platforms, but K-Fold crossvalidation through a validation column is supported by only a few platforms. See Validation in JMP Modeling.
The Make Validation Column platform enables you to create training, validation, and test sets using a variety of methods. You can specify stratification, grouping, or cutpoint columns to determine the method used to create the validation column.
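To give a sense of what stratification means here, the sketch below splits rows separately within each level of a stratification column, so that each set keeps roughly the same mix of levels. This is a generic stratified random split written in Python under stated assumptions, not the method the platform uses: the function name, the two-set split, and the 75% training proportion are all hypothetical.

```python
import random
from collections import defaultdict

def stratified_validation_column(strata, train=0.75, seed=0):
    """Label rows 'Training' or 'Validation', splitting within each stratum."""
    rng = random.Random(seed)
    # Collect the row indices that belong to each level of the stratification column.
    rows_by_level = defaultdict(list)
    for row, level in enumerate(strata):
        rows_by_level[level].append(row)
    labels = [None] * len(strata)
    for rows in rows_by_level.values():
        rng.shuffle(rows)
        n_train = round(len(rows) * train)
        for position, row in enumerate(rows):
            labels[row] = "Training" if position < n_train else "Validation"
    return labels

strata = ["A"] * 8 + ["B"] * 4
stratified = stratified_validation_column(strata)
```

With this toy input, both levels A and B contribute three quarters of their rows to the training set, which a purely random split would not guarantee.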