Publication date: 07/08/2024

Reports for Self-Validated Ensemble Models

In the Generalized Regression report, the model fit report for a self-validated ensemble model (SVEM) contains the following sections, which describe the ensemble model and the individual models that it is built from.

Parameter Estimates

The Parameter Estimates section contains estimates and other results for all parameters in the ensemble model. The first table contains the estimates of the debiasing parameters and the coefficients for the predictors in the model. An additional table includes other model parameters such as scale, dispersion, or zero inflation parameters, if the specified Distribution contains additional model parameters. See Specify a Distribution.

The debiasing parameters in the first table are the intercept and slope that are used to debias the SVEM predictions. These estimates are obtained by fitting a regression model of the response versus the linear predictor that is computed from the ensemble model parameter estimates. The regression model uses the same type of regression that was specified in the Model Launch options. These intercept and slope estimates also appear in the saved formulas for the SVEM predictions.

Note: The debiasing estimates are not applied to the values in the resampling estimates table.
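The debiasing step described above can be sketched in Python for the simplest case. This is a minimal illustration, not JMP's implementation: it assumes a normal response with an identity link, so the debiasing regression reduces to ordinary least squares of the response on the ensemble linear predictor. The function name debias_coefficients and all data values are hypothetical.

```python
import numpy as np

def debias_coefficients(y, linear_predictor):
    """Return the least-squares intercept and slope of y regressed on the linear predictor.

    Assumption: normal response, identity link (ordinary least squares).
    """
    y = np.asarray(y, dtype=float)
    lp = np.asarray(linear_predictor, dtype=float)
    # Design matrix with a column of ones for the debiasing intercept.
    design = np.column_stack([np.ones_like(lp), lp])
    (intercept, slope), *_ = np.linalg.lstsq(design, y, rcond=None)
    return intercept, slope

# Hypothetical predictors and ensemble (resampling) estimates.
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0], [0.0, 2.0]])
ensemble_intercept = 0.5
ensemble_coefs = np.array([1.0, 2.0])

# Linear predictor built from the ensemble parameter estimates.
lp = ensemble_intercept + X @ ensemble_coefs

# Hypothetical response, constructed to be exactly linear in lp so the answer is known.
y = 1.0 + 1.5 * lp

intercept, slope = debias_coefficients(y, lp)

# Debiased predictions apply the intercept and slope to the linear predictor.
debiased = intercept + slope * lp
```

In this sketch the intercept and slope are applied only when forming predictions. The ensemble estimates themselves are left unchanged, which mirrors the note above.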

The tables for the model terms and other model parameters both contain the following information:

Term

A list of the model terms. “Forced in” appears next to any terms that were forced into the model using the Advanced Controls option.

Resampling Estimate

The average of the parameter estimates of the model term in the individual models. For the normal distribution, this average value is the parameter estimate in the ensemble model.

Resampling Std Dev

The standard deviation of the estimates of the model term in the individual models. For the normal distribution, this value is the estimate of the parameter standard deviation in the ensemble model.

Percent Nonzero

(Does not appear in the other model parameters table in the Parameter Estimates section.) The percent of individual models that contain a nonzero estimate for each model term. Model terms that appear in fewer of the individual models are considered less important than model terms that appear in more of them.
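The three summary columns defined above can be illustrated with a short Python sketch that starts from a matrix of per-model estimates, with one row per individual model and one column per model term. The function name summarize_resampling and the data are hypothetical. The use of the sample standard deviation (divisor n - 1) is an assumption, because the divisor is not stated here.

```python
import numpy as np

def summarize_resampling(sample_estimates):
    """Summarize per-model estimates (rows = individual models, columns = terms)."""
    est = np.asarray(sample_estimates, dtype=float)
    # Resampling Estimate: average of each term's estimates across the models.
    resampling_estimate = est.mean(axis=0)
    # Resampling Std Dev: assumption, sample standard deviation (ddof=1).
    resampling_std_dev = est.std(axis=0, ddof=1)
    # Percent Nonzero: share of models in which the term's estimate is not zero.
    percent_nonzero = 100.0 * np.count_nonzero(est, axis=0) / est.shape[0]
    return resampling_estimate, resampling_std_dev, percent_nonzero

# Hypothetical estimates for three terms in four individual models.
sample_estimates = np.array([
    [2.0, 0.0, 1.0],
    [4.0, 0.0, 1.0],
    [3.0, 3.0, 1.0],
    [3.0, 0.0, 1.0],
])

estimate, std_dev, pct_nonzero = summarize_resampling(sample_estimates)
```

In this example the second term is nonzero in only one of the four models, so its Percent Nonzero is 25, and the zeros from the other three models pull its average toward zero.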

Sample Fit Quality

The Sample Fit Quality section contains a table of summary statistics for each of the individual models in the ensemble. The table contains the following columns:

Sample

The number of the individual model.

Nonzero Predictors

The number of predictors that have nonzero estimates in each individual model.

Training MSE

(Not available when the specified Distribution is Binomial.) The mean square error for the training set in each individual model.

Validation MSE

(Not available when the specified Distribution is Binomial.) The mean square error for the validation set in each individual model.
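As a rough illustration of the columns above, the following Python sketch computes one row of such a table for a single individual model. It rests on several assumptions that are not stated here: an identity link, an intercept that is not counted as a predictor, and training and validation sets that are represented by nonnegative row weights (0/1 weights give a plain partition of the rows). The function name sample_fit_quality and the data are hypothetical.

```python
import numpy as np

def sample_fit_quality(intercept, coefs, X, y, train_weights, valid_weights):
    """Return fit summaries for one individual model.

    Assumptions: identity link; the intercept is not counted as a predictor;
    each set is described by nonnegative row weights.
    """
    coefs = np.asarray(coefs, dtype=float)
    predictions = intercept + np.asarray(X, dtype=float) @ coefs
    squared_errors = (np.asarray(y, dtype=float) - predictions) ** 2
    return {
        "Nonzero Predictors": int(np.count_nonzero(coefs)),
        # Weighted mean of the squared errors over each set.
        "Training MSE": float(np.average(squared_errors, weights=train_weights)),
        "Validation MSE": float(np.average(squared_errors, weights=valid_weights)),
    }

# Hypothetical data for one individual model.
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 0.0]])
y = np.array([2.0, 0.0, 1.0, 2.0])
row = sample_fit_quality(
    intercept=0.0,
    coefs=[1.0, 0.0],
    X=X,
    y=y,
    train_weights=[1.0, 1.0, 0.0, 0.0],
    valid_weights=[0.0, 0.0, 1.0, 1.0],
)
```

Repeating this calculation for each individual model would give one row per value of Sample.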

Sample Parameter Estimates

The Sample Parameter Estimates section contains a table of the parameter estimates in the individual models. Each row of the table corresponds to an individual model in the ensemble. The first column contains the number of the individual model and the remaining columns correspond to the model terms.
