Suppose that, in the Tiretread.jmp data table, you are interested only in predicting ABRASION as a function of the three factor variables. In this example, you fit a generalized regression model to predict ABRASION and then perform bagging on that model. Finally, you make a prediction for a new observation and assess the accuracy of that prediction by obtaining a confidence interval for it.
1.
2. Select Analyze > Fit Model.
3.
4. Select Generalized Regression from the Personality list.
5.
6. Click Run.
7. Click Go.
1. Select Profilers > Profiler from the red triangle menu next to Adaptive Lasso with AICc Validation.
2. From the red triangle menu next to Prediction Profiler, select Save Bagged Predictions.
5. Confirm that Save Prediction Formulas is selected.
6. Click OK.
Note: This might take longer to run than the Example of Bagging to Improve Prediction. The larger number of samples gives a better estimate of the prediction distributions.
Return to the data table. For each response variable, there are three new columns: Pred Formula <colname> Bagged Mean, StdError <colname> Bagged Mean, and <colname> Bagged Std Dev. The Pred Formula ABRASION Bagged Mean column contains the final prediction.
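To make the three saved columns concrete, the following Python sketch summarizes a set of per-model predictions row by row. It is an illustration only, not JMP's implementation: the function name `summarize_bagged`, the dictionary keys, and the small input values are hypothetical, and the standard error is computed as the standard deviation divided by the square root of M, which is a common convention assumed here because the text above does not state the formula that JMP uses.

```python
import math
import statistics

def summarize_bagged(preds_by_model):
    """Summarize bagged predictions per data-table row.

    preds_by_model[m][i] is the prediction of bagged model m for row i.
    """
    m = len(preds_by_model)
    summary = []
    # zip(*...) regroups the values so that each tuple holds one row's M predictions.
    for row_preds in zip(*preds_by_model):
        mean = statistics.mean(row_preds)
        sd = statistics.stdev(row_preds)
        summary.append({
            "bagged_mean": mean,             # analogous to Pred Formula <colname> Bagged Mean
            "std_error": sd / math.sqrt(m),  # assumed formula: sd / sqrt(M)
            "bagged_std_dev": sd,            # analogous to <colname> Bagged Std Dev
        })
    return summary

# Hypothetical predictions from M = 3 models for 2 rows (not Tiretread.jmp values).
example = [[10.0, 20.0], [12.0, 22.0], [14.0, 18.0]]
for row in summarize_bagged(example):
    print(row)
```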
You now have predictions of ABRASION for each observation in the data table, as well as the standard errors of those predictions. Suppose that you have a new observation with values of 0.9, 43, and 2 for SILICA, SILANE, and SULFUR, respectively. You can predict the ABRASION response and obtain a confidence interval for that prediction because the Save Prediction Formulas option saves the regression equation for each bagged model. With M bagged models, M predictions are made for the new factor values, which creates a distribution of possible predictions. The mean of that distribution is the final prediction, and the spread of the distribution indicates how precise that prediction is.
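The idea of M bagged models producing M predictions for one new observation can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the JMP procedure: ordinary least squares stands in for the adaptive lasso fit, the data are synthetic rather than taken from Tiretread.jmp, and the helper names `fit_ols` and `bagged_predictions` are hypothetical.

```python
import random
import statistics

def fit_ols(rows, ys):
    """Least-squares coefficients (intercept first) via the normal equations."""
    X = [[1.0] + list(r) for r in rows]
    p = len(X[0])
    # Augmented system [X'X | X'y].
    A = [[sum(x[i] * x[j] for x in X) for j in range(p)]
         + [sum(x[i] * y for x, y in zip(X, ys))] for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(p):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][p] / A[i][i] for i in range(p)]

def bagged_predictions(rows, ys, new_row, n_models=200, seed=0):
    """Return one prediction for new_row from each bootstrap-fitted model."""
    rng = random.Random(seed)
    n = len(rows)
    preds = []
    for _ in range(n_models):
        # Bootstrap sample: draw n row indices with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        coef = fit_ols([rows[i] for i in idx], [ys[i] for i in idx])
        preds.append(coef[0] + sum(c * v for c, v in zip(coef[1:], new_row)))
    return preds

# Synthetic stand-in data with three factors (not the Tiretread.jmp values).
data_rng = random.Random(1)
rows = [[data_rng.random() for _ in range(3)] for _ in range(20)]
ys = [100 + 10 * a - 5 * b + 3 * c + data_rng.gauss(0, 2) for a, b, c in rows]

preds = bagged_predictions(rows, ys, new_row=[0.5, 0.5, 0.5])
final_prediction = statistics.mean(preds)  # the bagged mean
spread = statistics.stdev(preds)           # spread of the M predictions
print(f"bagged mean = {final_prediction:.2f}, std dev = {spread:.2f}")
```

Each pass through the loop plays the role of one saved prediction formula: it is fitted to a different bootstrap sample, so evaluating all of them at the same factor values yields a distribution rather than a single number.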
1. In the data table, select Rows > Add Rows.
2.
3. Under the SILICA column, type 0.9 in the box for the new row.
4. Under the SILANE column, type 43 in the box for the new row.
5. Under the SULFUR column, type 2 in the box for the new row.
Figure 2.34 Values for New Row
6. Select Tables > Transpose.
7.
8. Click OK.
9. Select Analyze > Distribution.
10.
11. Click OK.
12. From the red triangle menu next to Row 21, select Display Options > Horizontal Layout.
Figure 2.35 Distribution Report
The Distribution report in Figure 2.35 contains information about the distribution of the values of ABRASION predicted by each bagged model. The final prediction of ABRASION for the new observation is 104.45, which is the mean of all M bagged predictions. This prediction has a standard error of 4.56. You can also create confidence intervals for the new prediction using the quantiles. For example, the 2.5% and 97.5% quantiles give a 95% confidence interval of 95.89 to 113.00 for the new prediction.
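The quantile-based interval described above can be sketched in Python as follows. This is an illustration under assumptions, not a reproduction of the Distribution report: the function name `interval_95` is hypothetical, the input values are simulated rather than the bagged predictions from this example, and Python's quantile interpolation may differ slightly from the method that JMP uses.

```python
import random
import statistics

def interval_95(preds):
    """Return the 2.5% and 97.5% quantiles of a collection of predictions."""
    # n=40 yields 39 cut points spaced 2.5% apart; the first is the 2.5%
    # quantile and the last is the 97.5% quantile.
    cuts = statistics.quantiles(preds, n=40, method="inclusive")
    return cuts[0], cuts[-1]

# Simulated stand-in for the M bagged predictions of one new observation.
rng = random.Random(0)
simulated_preds = [rng.gauss(100.0, 5.0) for _ in range(500)]

lower, upper = interval_95(simulated_preds)
print(f"mean = {statistics.mean(simulated_preds):.2f}")
print(f"95% interval = ({lower:.2f}, {upper:.2f})")
```

Because the interval is read directly from the empirical quantiles, it does not assume that the bagged predictions are normally distributed.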