For this example, consider a study of patients who have liver cancer. Based on various measurements and markers, you want to classify patients according to their disease severity (High or Low). Two classification errors are possible: classifying a patient who has High severity into the Low group, or classifying a patient who has Low severity into the High group. Clinically, misclassifying a High severity patient as Low is a costly error, because that patient might not receive the aggressive treatment needed. Classifying a Low severity patient into the High severity group is a less costly error. That patient might receive more aggressive treatment than needed, but this is not a major concern.
In this example, you define a profit matrix in the context of a liver cancer study and obtain a Decision Matrix report. The Decision Matrix report helps you assess your classification rates relative to the costs in your profit matrix.
1. Select Help > Sample Data Library and open Liver Cancer.jmp.
2. Select Analyze > Predictive Modeling > Partition.
3. Select Severity and click Y, Response.
4. Select BMI through Jaundice and click X, Factor.
5. Select a validation procedure based on your JMP installation:
– For JMP Pro, select Validation and click Validation.
– For JMP, enter 0.3 as the Validation Portion.
Note: Results using the validation portion can differ from those shown here, due to the random selection of validation rows.
Figure 4.25 Completed Launch Window with Validation Portion = 0.3
6. Click OK.
7. Press Shift and click Split.
8. Enter 10 for the number of splits and click OK.
Check that the Number of Splits is 10 in the panel beneath the plot.
9. Click the red triangle next to Partition for Severity and select Specify Profit Matrix.
10. Change the entries to the following values:
– Enter 1 in the High, High box.
– Enter -5 in the High, Low box.
– Enter -3 in the Low, High box.
– Enter 1 in the Low, Low box.
Figure 4.26 Completed Profit Matrix
Tip: You can save this profit matrix as a column property for use in later analyses. Select the check box “Save to column as property” at the bottom of the profit matrix window.
Note the following:
– Each value of 1 reflects your profit when you make a correct decision.
– The -3 value indicates that if you classify a Low severity patient as High severity, your loss is 3 times as much as the profit of a correct decision.
– The -5 value indicates that if you classify a High severity patient as Low severity, your loss is 5 times as much as the profit of a correct decision.
11. Click OK.
12. Click the red triangle next to Partition for Severity and select Show Fit Details.
Figure 4.27 Confusion Matrix and Decision Matrix Reports
The Confusion Matrix and Decision Matrix reports follow the list of Measures in the Fit Details report. Notice that the Confusion Matrix report and the confusion matrices in the Decision Matrix report show different counts. This is because the weighting in the profit matrix leads to different decisions than classification based on the predicted probabilities alone.
The Confusion Matrix for the validation set shows classifications based on predicted probabilities alone. Based on these, 11 High severity patients would be classified as Low severity and 5 Low severity patients would be classified as High severity.
The Decision Matrix report incorporates the profit matrix weights. Using those weights, only 6 High severity patients are classified as Low severity (down from 11). However, this comes at the expense of misclassifying 6 Low severity patients into the High severity group (1 more than the 5 in the Confusion Matrix).
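The shift in counts can be illustrated with a small Python sketch. It assumes that the profit of a decision is the probability-weighted sum of the profit matrix entries for that decision (an expected profit); this is an assumption for illustration, not a statement of how JMP computes its reports. The names PROFIT and break_even_prob_high are hypothetical. Under that assumption, the sketch finds the value of Prob(Severity == High) at which both decisions are equally profitable.

```python
# Profit matrix from this example; keys are (actual level, decision).
PROFIT = {
    ("High", "High"): 1,
    ("High", "Low"): -5,
    ("Low", "High"): -3,
    ("Low", "Low"): 1,
}


def break_even_prob_high(profit):
    """Return the Prob(High) at which both decisions have equal expected profit.

    Assumed expected profits, with p = Prob(High):
      decide High: p * profit[High, High] + (1 - p) * profit[Low, High]
      decide Low:  p * profit[High, Low]  + (1 - p) * profit[Low, Low]
    Setting the two equal and solving for p gives the ratio below.
    """
    # Benefit of deciding High instead of Low when the patient is truly High.
    gain_when_high = profit[("High", "High")] - profit[("High", "Low")]
    # Benefit of deciding Low instead of High when the patient is truly Low.
    gain_when_low = profit[("Low", "Low")] - profit[("Low", "High")]
    return gain_when_low / (gain_when_high + gain_when_low)


threshold = break_even_prob_high(PROFIT)
print(f"Decide High when Prob(High) exceeds {threshold:.2f}")
```

With the values entered in this example, the break-even point is 0.4 rather than 0.5. Under the stated assumption, a patient whose predicted probability of High severity lies between 0.4 and 0.5 counts as Low in the Confusion Matrix but as High in the Decision Matrix, which is consistent with fewer High patients being classified as Low and more Low patients being classified as High.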
13. Click the red triangle next to Partition for Severity and select Save Columns > Save Prediction Formula.
Eight columns are added to the data table.
Tip: To quickly return to the data table, click the View Associated Data icon in the bottom right corner of the report window (Windows) or the Show Data Table icon on the tool bar menu (macOS).
– The first three columns involve only the predicted probabilities. The confusion matrix counts are based on the Most Likely Severity column, which classifies a patient into the level with the highest predicted probability. These probabilities are given in the Prob(Severity == High) and Prob(Severity == Low) columns.
– The last five columns involve the profit matrix weighting. The column called Most Profitable Prediction for Severity contains the decision based on the profit matrix. The decision for a patient is the level that results in the largest profit. The profits are given in the Profit for High and Profit for Low columns.
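The relationship between the probability columns and the profit columns can be sketched in Python. This is a minimal illustration, assuming that each profit column is the probability-weighted sum of the profit matrix entries for that decision; the function score_row, the dictionary keys, and the probability values passed to it are hypothetical and are not taken from Liver Cancer.jmp or from the saved JMP formulas.

```python
# Profit matrix from this example; keys are (actual level, decision).
PROFIT = {
    ("High", "High"): 1,
    ("High", "Low"): -5,
    ("Low", "High"): -3,
    ("Low", "Low"): 1,
}
LEVELS = ("High", "Low")


def score_row(prob_high):
    """Compute probability-based and profit-based classifications for one patient."""
    probs = {"High": prob_high, "Low": 1.0 - prob_high}

    # Classification from probabilities alone: the level with the highest probability.
    most_likely = max(LEVELS, key=lambda level: probs[level])

    # Assumed expected profit of each decision, weighting by the predicted probabilities.
    profit_for = {
        decision: sum(probs[actual] * PROFIT[(actual, decision)] for actual in LEVELS)
        for decision in LEVELS
    }

    # Decision from the profit matrix: the level with the largest profit.
    # On an exact tie, max() keeps the first level listed in LEVELS (High).
    most_profitable = max(LEVELS, key=lambda decision: profit_for[decision])

    return {
        "Most Likely": most_likely,
        "Profit for High": profit_for["High"],
        "Profit for Low": profit_for["Low"],
        "Most Profitable": most_profitable,
    }


# Hypothetical predicted probabilities, chosen only to illustrate the logic.
for p in (0.20, 0.45, 0.80):
    print(p, score_row(p))
```

For the hypothetical patient with a probability of 0.45, the two classifications disagree: Low is the most likely level, but High is the more profitable decision because misclassifying a High severity patient carries the larger loss.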