This example compares a logistic regression model and a decision tree model. The data describe automobiles, and the goal is to predict the size category (Type) of a purchased car.
Begin by selecting Help > Sample Data Folder and opening Car Physical Data.jmp.
Fit the logistic regression model:
1. Select Analyze > Fit Model.
2. Select Type and click Y.
3. Select the following columns and click Add: Country, Weight, Turning Cycle, Displacement, and Horsepower.
4. Click Run.
The Nominal Logistic Fit report appears.
5. To save the prediction formulas to columns, click the Nominal Logistic red triangle and select Save Probability Formula.
Fit the decision tree model:
1. Select Analyze > Predictive Modeling > Partition.
2. Select Type and click Y, Response.
3. Select the Country, Weight, Turning Cycle, Displacement, and Horsepower columns and click X, Factor.
4. Make sure that Decision Tree is selected in the Method list.
5. Click OK.
The Partition report appears.
6. Press Shift and click Split.
7. Enter 10 next to Enter Number of Splits.
8. Click OK.
9. To save the prediction formulas to columns, click the Partition red triangle and select Save Columns > Save Prediction Formula.
Compare the two models:
1. Select Analyze > Predictive Modeling > Model Comparison.
2. Select all columns that begin with Prob and click Y, Predictors.
3. Click OK.
Figure 11.7 Initial Model Comparison Report
The report shows that the Partition model has slightly higher values for Entropy RSquare and Generalized RSquare and a slightly lower value for Misclassification Rate.
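To make the three measures in the report concrete, the following Python sketch computes them from saved class probabilities. It assumes the usual definitions: Entropy RSquare is one minus the ratio of the model's negative log-likelihood to that of a constant-probability model, Generalized RSquare is the Nagelkerke scaling of the likelihood ratio, and Misclassification Rate is the share of rows whose most probable class differs from the actual class. The function name, dictionary keys, and toy rows are hypothetical and are not taken from Car Physical Data.jmp.

```python
import math

def fit_measures(labels, probs):
    """labels: actual class per row; probs: one dict per row mapping class -> predicted probability.
    Assumes every actual class has a nonzero predicted probability (log(0) is undefined)."""
    n = len(labels)
    # Negative log-likelihood of the fitted model
    nll_model = -sum(math.log(p[y]) for y, p in zip(labels, probs))
    # Negative log-likelihood of the constant model (observed class proportions)
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    nll_null = -sum(c * math.log(c / n) for c in counts.values())
    entropy_r2 = 1 - nll_model / nll_null
    # Nagelkerke form: (1 - (L0/Lm)^(2/n)) / (1 - L0^(2/n))
    gen_r2 = (1 - math.exp(2 * (nll_model - nll_null) / n)) / (1 - math.exp(-2 * nll_null / n))
    # A row is misclassified when its most probable class is not the actual class
    wrong = sum(max(p, key=p.get) != y for y, p in zip(labels, probs))
    return {
        "entropy_rsquare": entropy_r2,
        "generalized_rsquare": gen_r2,
        "misclassification_rate": wrong / n,
    }

# Hypothetical toy rows, for illustration only
toy_labels = ["Small", "Small", "Medium", "Large"]
toy_probs = [
    {"Small": 0.7, "Medium": 0.2, "Large": 0.1},
    {"Small": 0.6, "Medium": 0.3, "Large": 0.1},
    {"Small": 0.2, "Medium": 0.5, "Large": 0.3},
    {"Small": 0.1, "Medium": 0.3, "Large": 0.6},
]
print(fit_measures(toy_labels, toy_probs))
```

Higher RSquare values and a lower misclassification rate both indicate a better fit, which is how the two rows of the report are read against each other.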
4. Click the Model Comparison red triangle and select ROC Curve.
ROC curves appear for each Type, one of which is shown in Figure 11.8.
Figure 11.8 ROC Curve for Medium
Examining all the ROC curves, you see that the two models are similar in their predictive ability.
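Each ROC curve treats one level of Type as the positive class and all other levels as negative. The area under that curve (AUC) equals the probability that a randomly chosen positive row receives a higher predicted probability than a randomly chosen negative row, with ties counted as one half. The sketch below computes AUC directly from that definition. The function name and the toy values are hypothetical.

```python
def auc_one_vs_rest(labels, scores, level):
    """AUC for one response level against all others.
    labels: actual class per row; scores: predicted probability of `level` per row."""
    pos = [s for y, s in zip(labels, scores) if y == level]
    neg = [s for y, s in zip(labels, scores) if y != level]
    if not pos or not neg:
        raise ValueError("need at least one row in the level and one outside it")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0      # positive row ranked above negative row
            elif p == q:
                wins += 0.5      # ties count as half
    return wins / (len(pos) * len(neg))

# Hypothetical toy values: predicted probability of Medium for four rows
toy_labels = ["Medium", "Small", "Medium", "Large"]
toy_scores = [0.9, 0.2, 0.6, 0.7]
print(auc_one_vs_rest(toy_labels, toy_scores, "Medium"))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so two models with similar ROC curves for a level also have similar AUC values for that level.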
5. Click the Model Comparison red triangle and select AUC Comparison.
AUC Comparison reports appear for each Type, one of which is shown in Figure 11.9.
Figure 11.9 AUC Comparison for Medium
The report shows the results of a hypothesis test for the difference between the AUC values (area under the ROC curve) of the two models. Examining the results, you see that there is no statistically significant difference between the AUC values for any level of Type.
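To illustrate the idea of comparing two AUC values computed on the same rows, the following sketch uses a paired bootstrap. This is a swapped-in technique chosen for simplicity and is not the test that the AUC Comparison report performs. All function names, parameters, and data values are hypothetical.

```python
import random

def auc(is_pos, scores):
    """Rank-based AUC; returns None if the rows lack positives or negatives."""
    pos = [s for f, s in zip(is_pos, scores) if f]
    neg = [s for f, s in zip(is_pos, scores) if not f]
    if not pos or not neg:
        return None
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_auc_difference(is_pos, scores_a, scores_b, n_boot=2000, seed=0):
    """Observed AUC difference (A minus B) and a 95% percentile bootstrap interval.
    Rows are resampled together so both models are scored on the same resample."""
    rng = random.Random(seed)
    n = len(is_pos)
    diffs = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        flags = [is_pos[i] for i in idx]
        a = auc(flags, [scores_a[i] for i in idx])
        b = auc(flags, [scores_b[i] for i in idx])
        if a is None:            # resample lacked one of the two classes; skip it
            continue
        diffs.append(a - b)
    diffs.sort()
    lower = diffs[int(0.025 * len(diffs))]
    upper = diffs[min(len(diffs) - 1, int(0.975 * len(diffs)))]
    observed = auc(is_pos, scores_a) - auc(is_pos, scores_b)
    return observed, (lower, upper)

# Hypothetical toy values: True marks rows in the level of interest
is_medium = [True, False, True, False, True, False]
model_a = [0.9, 0.2, 0.6, 0.7, 0.8, 0.1]
model_b = [0.8, 0.3, 0.7, 0.6, 0.9, 0.2]
print(bootstrap_auc_difference(is_medium, model_a, model_b))
```

If the interval contains zero, the sample gives no clear evidence that one model ranks the rows better than the other for that level.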
You conclude that there is no large difference between the predictive abilities of the two models for the following reasons:
• The RSquare values and the ROC curves are similar.
• There is no statistically significant difference between AUC values.