Publication date: 07/24/2024

Model Fit Options for Self-Validated Ensemble Models

In the Generalized Regression report, the red triangle menu in each self-validated ensemble model report contains the following options:

Regression Reports

Enables you to customize the reports that are shown for the specified model fit.

Parameter Estimates for Original Predictors

Shows or hides the Parameter Estimates report. See Reports for Self-Validated Ensemble Models.

Covariance of Estimates

Shows or hides a matrix showing the covariances of the self-validated ensemble model parameter estimates.

Correlation of Estimates

Shows or hides a matrix showing the correlations of the self-validated ensemble model parameter estimates.

Confusion Matrix

(Available only when the specified Distribution is Binomial.) Shows or hides a matrix that tabulates the actual response levels against the predicted response levels, so that you can assess how well the predicted responses align with the actual responses. For a good model, most observations fall on the diagonal, where the predicted response level is the same as the actual response level. The misclassification rate summarizes the off-diagonal results. If you specified a Validation column in the Fit Model specification window, a second matrix labeled Test is shown for the observations held out of the SVEM procedure. This Test set corresponds to the combined Validation and Test sets in the Validation column.
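As an illustration of the tabulation described above, the following Python sketch counts actual-by-predicted pairs and computes a misclassification rate. It is not JMP code; the function names, level labels, and data are hypothetical and chosen only for this example.

```python
from collections import Counter

def confusion_matrix(actual, predicted):
    """Count rows for each (actual level, predicted level) pair."""
    return Counter(zip(actual, predicted))

def misclassification_rate(actual, predicted):
    """Fraction of rows whose predicted level differs from the actual level."""
    wrong = sum(a != p for a, p in zip(actual, predicted))
    return wrong / len(actual)

# Hypothetical response levels for six rows.
actual    = ["Yes", "Yes", "No", "No",  "Yes", "No"]
predicted = ["Yes", "No",  "No", "Yes", "Yes", "No"]

cm = confusion_matrix(actual, predicted)
rate = misclassification_rate(actual, predicted)

for (a, p), count in sorted(cm.items()):
    print(f"actual={a:<3} predicted={p:<3} count={count}")
print(f"misclassification rate = {rate:.3f}")
```

The diagonal cells are the pairs where the two levels agree; the two off-diagonal rows in this example account for the whole misclassification rate.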

Set Probability Threshold

(Available only when the specified Distribution is Binomial.) Specify a cutoff probability for classifying the response. By default, an observation is classified into the Target Level when its predicted probability exceeds 0.5. Change the threshold to specify a value other than 0.5 as the cutoff for classification into the Target Level. The Predicted Rate in the confusion matrix and the misclassification rate are updated to reflect classification according to the specified threshold.

If the response has a Profit Matrix column property, the initial value for the probability threshold is determined by the profit matrix.
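The threshold rule described above can be sketched in Python as follows. This is a minimal illustration, not JMP code: the `classify` function, the level labels, and the probabilities are hypothetical, and the sketch does not cover the case where a profit matrix sets the initial threshold.

```python
def classify(probabilities, threshold=0.5, target="Yes", other="No"):
    """Assign the target level when the predicted probability exceeds the threshold."""
    return [target if p > threshold else other for p in probabilities]

# Hypothetical predicted probabilities of the target level.
probs = [0.92, 0.55, 0.48, 0.30, 0.71]

default_labels = classify(probs)                # cutoff of 0.5
strict_labels = classify(probs, threshold=0.6)  # a higher cutoff

print(default_labels)
print(strict_labels)
```

Raising the cutoff from 0.5 to 0.6 moves the row with probability 0.55 out of the target level, which is the kind of change that the confusion matrix and misclassification rate then reflect.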

Profilers

(Not available for models that contain a predictor that has the Vector modeling type.) Enables you to explore the self-validated ensemble model with a prediction profiler.

Profiler

Shows or hides the Prediction Profiler. Predictors that have parameter estimates of zero and that are not involved in any interaction terms with nonzero coefficients do not appear in the profiler. For more information about the prediction profiler, see “Profiler” in Profilers.

Diagnostic Plots

Provides various plots to help assess how well the self-validated ensemble model fits.

Plot Actual by Predicted

(Not available when the specified Distribution is Binomial.) Plots actual Y values on the vertical axis and predicted Y values on the horizontal axis. If you specified a Validation column in the Fit Model specification window, a second plot labeled Test is shown for the observations held out of the SVEM procedure. This Test set corresponds to the combined Validation and Test sets in the Validation column.

Plot Residual by Predicted

(Not available when the specified Distribution is Binomial.) Plots the residuals on the vertical axis and the predicted Y values on the horizontal axis. If you specified a Validation column in the Fit Model specification window, a second plot labeled Test is shown for the observations held out of the SVEM procedure. This Test set corresponds to the combined Validation and Test sets in the Validation column.

ROC Curve

(Available only when the specified Distribution is Binomial.) Shows or hides the Receiver Operating Characteristic (ROC) curve. If you specified a Validation column in the Fit Model specification window, the ROC Curve plot corresponds to the Training set in the Validation column. See “ROC Curve” in Predictive and Specialized Modeling.
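For readers who want to see how the points on an ROC curve arise, here is a small Python sketch under stated assumptions. It is not JMP code; the function names and data are hypothetical, both response levels are assumed to be present, and a row is flagged as the target when its probability is at or above the threshold (a convention chosen for this sketch).

```python
def roc_points(actual, probabilities, target="Yes"):
    """Return (false positive rate, true positive rate) pairs, one per threshold."""
    positives = sum(a == target for a in actual)
    negatives = len(actual) - positives
    points = [(0.0, 0.0)]  # threshold above every probability: nothing flagged
    for t in sorted(set(probabilities), reverse=True):
        tp = sum(a == target and p >= t for a, p in zip(actual, probabilities))
        fp = sum(a != target and p >= t for a, p in zip(actual, probabilities))
        points.append((fp / negatives, tp / positives))
    return points

def auc(points):
    """Area under the ROC points by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Hypothetical actual levels and predicted probabilities of "Yes".
actual = ["Yes", "Yes", "No", "Yes", "No", "No"]
probs  = [0.9,   0.8,   0.7,  0.6,   0.4,  0.2]

points = roc_points(actual, probs)
area = auc(points)
print(points)
print(f"AUC = {area:.4f}")
```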

Precision Recall Curve

(Available only when the specified Distribution is Binomial.) Shows or hides the Precision-Recall Curve plot. A precision-recall curve plots the precision values against the recall values at a variety of thresholds. If you specified a Validation column in the Fit Model specification window, the Precision-Recall Curve plot corresponds to the Training set in the Validation column. See “Precision-Recall Curve” in Predictive and Specialized Modeling.
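The precision and recall values at each threshold can be computed as in the Python sketch below. It is an illustration only, not JMP code; the function name and data are hypothetical, and a row is treated as flagged when its probability is at or above the threshold.

```python
def precision_recall_points(actual, probabilities, target="Yes"):
    """Return (recall, precision) pairs, one per distinct probability threshold."""
    positives = sum(a == target for a in actual)
    points = []
    for t in sorted(set(probabilities), reverse=True):
        tp = sum(a == target and p >= t for a, p in zip(actual, probabilities))
        flagged = sum(p >= t for p in probabilities)
        recall = tp / positives   # share of actual targets that were flagged
        precision = tp / flagged  # share of flagged rows that are actual targets
        points.append((recall, precision))
    return points

# Hypothetical actual levels and predicted probabilities of "Yes".
actual = ["Yes", "Yes", "No", "Yes", "No", "No"]
probs  = [0.9,   0.8,   0.7,  0.6,   0.4,  0.2]

pr = precision_recall_points(actual, probs)
for recall, precision in pr:
    print(f"recall={recall:.3f} precision={precision:.3f}")
```

Lowering the threshold can only keep recall the same or raise it, while precision can move in either direction, which is why the curve is usually inspected over the whole range of thresholds.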

Lift Curve

(Available only when the specified Distribution is Binomial.) Shows or hides the lift curve for the model. If you specified a Validation column in the Fit Model specification window, the Lift Curve plot corresponds to the Training set in the Validation column. See “Lift Curve” in Predictive and Specialized Modeling.
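One common way to define lift is sketched below in Python: rows are ranked by predicted probability, and the rate of the target level in the top portion is divided by the overall rate. This is an assumption made for illustration and may differ in detail from how the platform draws the curve; the function name and data are hypothetical, and ties in probability are not treated specially.

```python
def lift_points(actual, probabilities, target="Yes"):
    """Return (portion of rows, lift) pairs after ranking rows by probability."""
    ranked = sorted(zip(probabilities, actual), key=lambda pair: pair[0], reverse=True)
    overall_rate = sum(a == target for a in actual) / len(actual)
    points = []
    hits = 0
    for n, (_, a) in enumerate(ranked, start=1):
        hits += a == target        # running count of targets in the top n rows
        rate_in_top = hits / n
        points.append((n / len(ranked), rate_in_top / overall_rate))
    return points

# Hypothetical actual levels and predicted probabilities of "Yes".
actual = ["Yes", "Yes", "No", "Yes", "No", "No"]
probs  = [0.9,   0.8,   0.7,  0.6,   0.4,  0.2]

lift = lift_points(actual, probs)
for portion, value in lift:
    print(f"portion={portion:.3f} lift={value:.3f}")
```

When all rows are included, the rate in the selected portion equals the overall rate, so the lift at a portion of 1 is 1.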

Decision Threshold

(Available only when the specified Distribution is Binomial.) Shows or hides Decision Thresholds reports for the training, validation, and test sets, if specified. Each report contains a graph of the distribution of fitted probabilities for each model, confusion matrices for each model, and classification graphs to compare the model fits. See “Decision Thresholds Report” in Predictive and Specialized Modeling for more information about the Decision Thresholds report.

Save Columns

Enables you to save columns based on the fitted model to the data table.

Save Prediction Formula

(Available only when the specified Distribution is Normal.) Saves a new column to the original data table. The new column contains the prediction formula for the self-validated ensemble model. The prediction formula does not contain zeroed terms, and it includes the application of the debiasing intercept and slope estimates. See Statistical Details for Distributions for mean formulas.

Save Resample Formulas

Saves multiple formula columns to the original data table. First, a column group called SVEM Samples contains one formula column per individual model in the ensemble. These columns are saved as hidden columns, and their prediction formulas include the application of the debiasing intercept and slope estimates. The next column is a prediction formula for the self-validated ensemble model, which is the average prediction across the individual models. Then there is a column that contains the standard error formula for the self-validated ensemble model. The final column contains the median prediction across the individual models for each row.
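To make the row-wise summaries concrete, the Python sketch below combines the predictions of several individual models for a single row. It is not JMP code, and the names and values are hypothetical. In particular, the sample standard deviation is used here only as an assumed stand-in for a spread measure; the saved standard error column may be computed differently.

```python
from statistics import mean, median, stdev

def summarize_row(model_predictions):
    """Summarize one row's predictions from the individual models."""
    return {
        "mean": mean(model_predictions),      # ensemble prediction as the average
        "median": median(model_predictions),  # median prediction across models
        "spread": stdev(model_predictions),   # assumption: sample standard deviation
    }

# Hypothetical predictions for one row from four individual models.
row = [10.0, 12.0, 11.0, 15.0]

summary = summarize_row(row)
print(summary)
```

Comparing the mean and the median for a row gives a quick sense of whether a few individual models pull the ensemble prediction in one direction.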

Publish Prediction Formula

(Available only when the specified Distribution is Normal.) Creates a prediction formula and saves it as a formula column script in the Formula Depot platform. The prediction formula includes the application of the debiasing intercept and slope estimates. If a Formula Depot report is not open, this option creates a Formula Depot report. See “Formula Depot” in Predictive and Specialized Modeling.

Remove Fit

Removes the report for the fit.
