Publication date: 07/08/2024

K-Fold and Holdback Validation

The Validation Column role provides a framework for partitioning data into cross-validation sets. Some JMP platforms also support K-Fold Cross-Validation and various types of Holdback validation.

K-Fold Cross-Validation

Divides the original data into K subsets. In turn, each of the K sets is used to validate the model fit on the rest of the data, fitting a total of K models. The model that produces the best validation statistic is chosen as the final model, and the fold that is not used in the building of that model provides the validation set performance statistics shown in the report.

Note: How you specify K-Fold Cross-Validation depends on the platform. Some platforms require it in the model control panel, others in the platform launch window, and still others through a validation column that contains more than three levels.
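The K-Fold procedure described above can be sketched outside of JMP as follows. This is a minimal illustration, not JMP's implementation: the helper names (fit_line, rmse, k_fold_select), the straight-line model, and the use of RMSE as the validation statistic are all assumptions made for the example.

```python
import random
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x (a stand-in for any model)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b

def rmse(model, xs, ys):
    """Root mean squared error, used here as the validation statistic (assumed)."""
    a, b = model
    return (sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)) ** 0.5

def k_fold_select(xs, ys, k, seed=0):
    """Fit K models, validate each on its held-out fold, return the best."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # K roughly equal subsets
    results = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        model = fit_line([xs[i] for i in train], [ys[i] for i in train])
        score = rmse(model, [xs[i] for i in fold], [ys[i] for i in fold])
        results.append((score, model, fold))
    # The model with the lowest validation RMSE is kept; its held-out fold
    # supplies the validation statistics.
    return min(results, key=lambda r: r[0])

xs = [float(i) for i in range(20)]
ys = [2.0 * x + 1.0 + (0.5 if i % 2 else -0.5) for i, x in enumerate(xs)]
best_score, best_model, best_fold = k_fold_select(xs, ys, k=5)
```

Here best_fold holds the row indices that were not used to build the selected model, which mirrors the fold whose statistics appear in the report.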

Random Validation Holdback

(Available as a launch option for specific platforms.) Randomly divides the original data into the training and validation sets. A test set can also be included. You can specify the proportions of the original data to use in each set.
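As an illustration of a random holdback split, the sketch below assigns row indices to training, validation, and test sets by proportion. The function name random_holdback, its parameters, and the rounding rule are assumptions for this example and do not describe how JMP assigns rows internally.

```python
import random

def random_holdback(n_rows, train=0.6, validation=0.2, test=0.2, seed=0):
    """Randomly assign row indices to training, validation, and test sets."""
    if abs(train + validation + test - 1.0) > 1e-9:
        raise ValueError("proportions must sum to 1")
    rows = list(range(n_rows))
    random.Random(seed).shuffle(rows)
    n_train = round(n_rows * train)
    n_val = round(n_rows * validation)
    return {
        "training": sorted(rows[:n_train]),
        "validation": sorted(rows[n_train:n_train + n_val]),
        "test": sorted(rows[n_train + n_val:]),  # remaining rows
    }

split = random_holdback(100, train=0.6, validation=0.2, test=0.2)
```

Setting test=0.0 in this sketch leaves the test set empty, which corresponds to a plain training and validation split.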

Leave-One-Out Validation Holdback

(Available as an option for specific platforms.) Repeatedly fits the model leaving out one observation at a time. Leave-one-out validation is also known as the jackknife procedure.
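The repeated refitting can be sketched as a loop that drops one observation per pass. The straight-line model and the names fit_line and leave_one_out_rmse are assumptions chosen to keep the example short; they are not taken from JMP.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x (a stand-in for any model)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b

def leave_one_out_rmse(xs, ys):
    """Refit once per observation, predicting only the row that was left out."""
    squared_errors = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        a, b = fit_line(train_x, train_y)
        squared_errors.append((ys[i] - (a + b * xs[i])) ** 2)
    return mean(squared_errors) ** 0.5

xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0, 11.0]  # exactly y = 1 + 2x
loo = leave_one_out_rmse(xs, ys)
```

Because the model is refit once per row, the cost grows with the number of observations.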

Excluded Rows as Validation Holdback

Uses the excluded rows in the data table as a validation holdback set. In JMP Pro, you enable this option in the platform preferences.

Note: For platforms that support using excluded rows as a validation holdback set, the excluded rows are used only when there is no validation column or validation proportion specified in the launch window.
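The precedence rule in the note can be written as a small decision function. This is only a sketch of the stated rule: the function name, its arguments, and the returned labels are hypothetical, and treating a proportion of 0 as "not specified" is an assumption.

```python
def validation_source(validation_column=None, validation_proportion=None,
                      excluded_rows=()):
    """Return which source supplies the validation set under the stated rule."""
    # A validation column or proportion given in the launch window takes
    # priority over excluded rows.
    if validation_column is not None or validation_proportion:
        return "launch window"
    # Excluded rows are used only when neither launch setting is present.
    if excluded_rows:
        return "excluded rows"
    return "none"

choice = validation_source(validation_proportion=0.25, excluded_rows=(3, 7))
```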

Table A.2 K-Fold and Holdback Validation by Platform

| Platform | Excluded Rows as Validation Holdback | Random Validation Holdback | Leave-One-Out Holdback | K-Fold Cross-Validation |
|---|---|---|---|---|
| Fit Model | | | | |
| Fit Least Squares | No | No | No | No |
| Stepwise Regression | No | No | No | Yes (for continuous response models only) |
| Logistic Regression | No | No | No | No |
| Generalized Regression | No | Yes | Yes | Yes (through the model controls) |
| Partial Least Squares | No | Yes | Yes | Yes (through the model controls) |
| Predictive Model | | | | |
| Neural | Yes | Yes | No | Yes (through model launch or validation column) |
| Partition | Yes | Yes | No | Yes (select the option in the platform preferences) |
| Bootstrap Forest | Yes | Yes | No | No |
| Boosted Tree | Yes | Yes | No | No |
| K Nearest Neighbors | Yes | Yes | No | No |
| Naive Bayes | Yes | Yes | No | No |
| Support Vector Machines | No | Yes | No | Yes (through model launch) |
| Specialized Models | | | | |
| Functional Data Explorer | No | No | No | No |
| Multivariate Models | | | | |
| Discriminant | Optional | No | No | No |
| Partial Least Squares | No | Yes | Yes | Yes (through model launch or validation column) |
| Uplift | No | Yes | No | |
