The Validation Column role provides a framework for partitioning data into cross-validation sets. Some JMP platforms also support K-Fold Cross-Validation and various types of Holdback validation.
K-Fold Cross-Validation
Divides the original data into K subsets. In turn, each of the K sets is used to validate the model fit on the rest of the data, fitting a total of K models. The model that produces the best validation statistic is chosen as the final model, and the fold that is not used in the building of that model provides the test set performance statistics.
Note: For some platforms, you must specify K-Fold Cross-Validation in the model control panel. For other platforms, you must specify K-Fold Cross-Validation in the platform launch window. For still other platforms, you must specify K-Fold Cross-Validation through a validation column that contains more than three levels.
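The K-Fold procedure described above can be sketched in a few lines of Python. This is an illustration of the general technique only, not JMP's implementation: the function names, the placeholder "model" (the mean of the training rows), and the choice of mean squared error as the validation statistic are all assumptions made for the example.

```python
import random
from statistics import mean


def k_fold_indices(n_rows, k, seed=0):
    """Shuffle row indices and deal them into k roughly equal folds."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def k_fold_select(y, k, seed=0):
    """Fit one model per fold and keep the one with the best validation score.

    Assumption: the "model" is a placeholder that predicts the mean of the
    training rows, and the validation statistic is mean squared error
    (lower is better).
    """
    folds = k_fold_indices(len(y), k, seed)
    results = []
    for fold_id, held_out in enumerate(folds):
        held = set(held_out)
        # Fit on every row that is not in the held-out fold.
        train = [y[i] for i in range(len(y)) if i not in held]
        prediction = mean(train)
        # Score the fitted model on the held-out fold.
        mse = mean((y[i] - prediction) ** 2 for i in held_out)
        results.append((mse, fold_id, prediction))
    best_mse, best_fold, best_model = min(results)
    return {
        "fold": best_fold,
        "model": best_model,
        "validation_mse": best_mse,
        "n_models": len(results),
    }
```

Calling `k_fold_select(y, 5)` fits five placeholder models and reports which held-out fold produced the lowest validation error.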
Random Validation Holdback
(Available as a launch option for specific platforms.) Randomly divides the original data into the training and validation sets. A test set can also be included. You can specify the proportions of the original data to use in each set.
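As a rough sketch of what a random holdback split involves, the following Python function assigns row indices to training, validation, and optional test sets according to user-specified proportions. The function name, its parameters, and the rounding rule are hypothetical and chosen for illustration; JMP's own assignment logic may differ.

```python
import random


def random_holdback(n_rows, validation=0.25, test=0.0, seed=0):
    """Randomly assign row indices to training, validation, and test sets.

    Assumption: set sizes are the requested proportions rounded to the
    nearest row count, and the remaining rows form the training set.
    """
    if validation < 0 or test < 0 or validation + test >= 1:
        raise ValueError("proportions must be non-negative and sum to less than 1")
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    n_val = round(n_rows * validation)
    n_test = round(n_rows * test)
    return {
        "validation": sorted(indices[:n_val]),
        "test": sorted(indices[n_val:n_val + n_test]),
        "training": sorted(indices[n_val + n_test:]),
    }
```

For example, `random_holdback(100, validation=0.2, test=0.1)` yields 70 training rows, 20 validation rows, and 10 test rows, with every row assigned to exactly one set.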
Leave-One-Out Validation Holdback
(Available as an option for specific platforms.) Repeatedly fits the model leaving out one observation at a time. Leave-one-out validation is also known as the jackknife procedure.
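The leave-one-out idea can be illustrated with a short Python sketch. As a simplifying assumption, the "model" here is just the mean of the remaining observations and the reported statistic is the mean squared prediction error; the function name is hypothetical and this is not how any particular JMP platform computes its results.

```python
from statistics import mean


def leave_one_out_mse(y):
    """Refit a placeholder model n times, each time leaving out one row.

    Assumption: the model predicts the mean of the rows it was fit on.
    Returns the mean squared error over the n left-out observations.
    """
    y = list(y)
    if len(y) < 2:
        raise ValueError("need at least two observations")
    squared_errors = []
    for i in range(len(y)):
        # Fit on all rows except row i, then predict row i.
        train = y[:i] + y[i + 1:]
        prediction = mean(train)
        squared_errors.append((y[i] - prediction) ** 2)
    return mean(squared_errors)
```

With `y = [1.0, 2.0, 3.0]`, the three left-out errors are 2.25, 0, and 2.25, so the function returns 1.5.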
Excluded Rows as Validation Holdback
Uses the excluded rows in the data table as a validation holdback set. In JMP Pro, this option is enabled in the platform preferences.
Note: For platforms that support using excluded rows as a validation holdback set, the excluded rows are used only when there is no validation column or validation proportion specified in the launch window.
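The precedence rule in this note can be expressed as a small decision function. The sketch below is only a restatement of the rule in Python: the function name, argument names, and return labels are hypothetical, and treating `None` as "not specified" is an assumption made for the example.

```python
def resolve_holdback_source(validation_column=None,
                            validation_proportion=None,
                            n_excluded_rows=0):
    """Decide where the validation holdback set comes from.

    Assumption: None means the setting was not specified in the
    launch window. Excluded rows are considered only as a fallback.
    """
    if validation_column is not None or validation_proportion is not None:
        # An explicit launch-window setting takes priority over excluded rows.
        return "launch window"
    if n_excluded_rows > 0:
        return "excluded rows"
    return "none"
```

The sketch deliberately does not rank a validation column against a validation proportion, because the note does not say how those two interact.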
Table A.2 K-Fold and Holdback Validation by Platform
Platform | Excluded Rows as Validation Holdback | Random Validation Holdback | Leave-One-Out Holdback | K-Fold Cross-Validation |
---|---|---|---|---|
Fit Model | | | | |
Fit Least Squares | No | No | No | No |
Stepwise Regression | No | No | No | Yes (for continuous response models only) |
Logistic Regression | No | No | No | No |
Generalized Regression | No | Yes | Yes | Yes (through the model controls) |
Partial Least Squares | No | Yes | Yes | Yes (through the model controls) |
Predictive Model | | | | |
Neural | Yes | Yes | No | Yes (through model launch or validation column) |
Partition | Yes | Yes | No | Yes (select the option in the platform preferences) |
Bootstrap Forest | Yes | Yes | No | No |
Boosted Tree | Yes | Yes | No | No |
K Nearest Neighbors | Yes | Yes | No | No |
Naive Bayes | Yes | Yes | No | No |
Support Vector Machines | No | Yes | No | Yes (through model launch) |
Specialized Models | | | | |
Functional Data Explorer | No | No | No | No |
Multivariate Models | | | | |
Discriminant | Optional | No | No | No |
Partial Least Squares | No | Yes | Yes | Yes (through model launch or validation column) |
Uplift | No | Yes | | No |