1.
|
2. Select Analyze > Predictive Modeling > K Nearest Neighbors.
3.
|
4.
|
Because one of the potential predictors, DEBTINC, has many missing values, you do not include it in your model. Missing values for continuous predictors are replaced by the average of the predictor. This procedure works well for values that are missing at random. The high missing rate of DEBTINC indicates that the missingness might be informative. However, that possibility is not investigated in this example.
5.
|
6. Click OK.
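The mean replacement of missing continuous values mentioned in the note above can be sketched outside JMP as follows. This is only an illustrative Python sketch, not JMP's internal implementation: the function names and the toy rows are hypothetical, and the source does not say which rows JMP uses to compute the average, so this sketch assumes all supplied rows.

```python
import math

def is_missing(value):
    """Treat None and NaN as missing (an assumption for this sketch)."""
    return value is None or (isinstance(value, float) and math.isnan(value))

def impute_column_means(rows):
    """Return a copy of rows with each missing value replaced by the
    mean of the observed values in the same column."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [row[j] for row in rows if not is_missing(row[j])]
        # A column with no observed values has no mean; leave it as NaN.
        means.append(sum(observed) / len(observed) if observed else float("nan"))
    return [
        [means[j] if is_missing(value) else value for j, value in enumerate(row)]
        for row in rows
    ]

# Hypothetical toy rows with two continuous predictors; one value is missing.
toy_rows = [
    [1100.0, 25860.0],
    [1300.0, None],
    [1500.0, 13500.0],
]
imputed_rows = impute_column_means(toy_rows)
```

Mean replacement keeps every row available for distance calculations, but it is only appropriate when values are missing at random, which is why a predictor with a very high missing rate is left out of the model instead.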
Figure 7.2 K Nearest Neighbors Report
For each value of K, JMP constructs a model using only the training set observations. Each of these models is used to classify the validation set observations. The validation set results are used to select a best model. In this example, the model based on the single nearest neighbor (K = 1) has the smallest misclassification rate. The test set verifies that the single nearest neighbor model is the best performer for independent data.
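The selection procedure described above (fit on the training set for each K, score on the validation set, confirm on the test set) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not JMP's algorithm: it assumes unscaled Euclidean distance, majority voting, and ties between candidate K values broken in favor of the smaller K. All function names and the toy data are hypothetical.

```python
from collections import Counter

def knn_predict(train_X, train_y, x, k):
    """Classify x by majority vote among its k nearest training rows
    (squared Euclidean distance; an assumption of this sketch)."""
    order = sorted(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)),
    )
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]

def misclassification_rate(train_X, train_y, X, y, k):
    """Fraction of rows in (X, y) that the K-neighbor model gets wrong."""
    wrong = sum(
        knn_predict(train_X, train_y, x, k) != label for x, label in zip(X, y)
    )
    return wrong / len(y)

def select_k(train_X, train_y, valid_X, valid_y, k_values):
    """Score each K on the validation set and return the K with the
    smallest misclassification rate (smaller K wins ties)."""
    rates = {
        k: misclassification_rate(train_X, train_y, valid_X, valid_y, k)
        for k in k_values
    }
    best_k = min(rates, key=lambda k: (rates[k], k))
    return best_k, rates

# Hypothetical toy data: one predictor, two classes.
train_X = [[0.0], [1.0], [2.0], [5.0], [6.0], [7.0]]
train_y = [0, 0, 0, 1, 1, 1]
valid_X = [[0.4], [2.4], [4.6], [6.6]]
valid_y = [0, 0, 1, 1]
test_X = [[1.4], [5.6]]
test_y = [0, 1]

best_k, valid_rates = select_k(train_X, train_y, valid_X, valid_y, [1, 3, 5])
# The test set is used only once, to check the selected model on independent data.
test_rate = misclassification_rate(train_X, train_y, test_X, test_y, best_k)
```

The key design point is that the validation set alone decides K, while the test set plays no part in the choice and therefore gives an independent check of the selected model.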
7.
|
8. Next to Number of Neighbors, K, leave the default value of 1.
9. Click OK.
The prediction equation is saved in the Formula Depot. Using the Model Comparison option in the Formula Depot, you can compare the performance of alternative models published to the Formula Depot with that of the K = 1 nearest neighbor model. See Formula Depot in the Predictive and Specialized Modeling book.