Publication date: 07/08/2024

Model Fit Options

The Model Fit section of the K Nearest Neighbors report contains the following red triangle menu options:

Mosaic Plot

(Available only for nominal or ordinal responses.) Shows or hides the mosaic plot. See Mosaic Plot.

Plot Actual by Predicted

(Available only for continuous responses.) Plots the actual versus the predicted responses for the model with the smallest RASE. If there are ties for the smallest RASE, the plot is based on the model with the smallest K.

Tip: If you change the position of the slider on the solution path plot to a different K, the Actual by Predicted plot is updated to reflect the model for the chosen value of K.
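
The selection rule used by the Actual by Predicted and Residual by Predicted plots (smallest RASE, with ties going to the smallest K) can be sketched in a few lines of Python. This is only an illustration, not JMP's implementation: it assumes RASE is computed as the square root of the mean squared difference between actual and predicted values, and the names rase, best_k, and predictions_by_k, as well as the data, are hypothetical.

```python
import math

def rase(actual, predicted):
    """Root average squared error: sqrt(mean((actual - predicted)^2))."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)

def best_k(actual, predictions_by_k):
    """Return the K with the smallest RASE; ties go to the smallest K."""
    # Comparing (RASE, K) tuples breaks exact ties toward the smaller K.
    return min(predictions_by_k,
               key=lambda k: (rase(actual, predictions_by_k[k]), k))

# Hypothetical data: predictions for three candidate values of K.
actual = [1.0, 2.0, 3.0, 4.0]
predictions_by_k = {
    1: [1.5, 2.5, 2.5, 3.5],
    2: [0.5, 1.5, 3.5, 4.5],  # same RASE as K = 1
    3: [2.0, 3.0, 2.0, 3.0],
}
chosen = best_k(actual, predictions_by_k)
```

In this made-up data, K = 1 and K = 2 have the same RASE, so the sketch selects K = 1. With real data, floating-point round-off can hide a tie, so a tolerance-based comparison might be needed.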

Plot Residual by Predicted

(Available only for continuous responses.) Plots the residuals versus the predicted responses for the model with the smallest RASE. If there are ties for the smallest RASE, the plot is based on the model with the smallest K.

Tip: If you change the position of the slider on the solution path plot to a different K, the Residual by Predicted plot is updated to reflect the model for the chosen value of K.

Save Predicteds

Saves K predicted value columns to the data table. The columns are named Predicted <Response> <k>, where Response is the name of the response column. The kth column contains the predictions from the model based on the k nearest neighbors. The report statistics that appear in the platform are generated using these raw predictions.
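
The following Python sketch shows one way that K columns of predictions could be produced for a continuous response. It is a sketch under stated assumptions, not JMP's algorithm: it assumes Euclidean distance on unscaled predictors, predicts each row as the mean response of its k nearest other rows (the row being predicted is held out, as described in the Caution under Save Prediction Formula), and uses a hypothetical response named Y. The function name loo_knn_predictions and the data are made up for illustration.

```python
import math

def loo_knn_predictions(X, y, max_k):
    """Predict each row from its nearest other rows, for k = 1..max_k.

    Requires max_k <= len(X) - 1, because the row itself is excluded.
    """
    n = len(X)
    columns = {k: [] for k in range(1, max_k + 1)}
    for i in range(n):
        # Rank every other row by distance to row i (row i is excluded).
        others = sorted(
            (math.dist(X[i], X[j]), j) for j in range(n) if j != i
        )
        for k in range(1, max_k + 1):
            neighbors = [y[j] for _, j in others[:k]]
            columns[k].append(sum(neighbors) / k)
    return columns

# Hypothetical training data: one predictor, continuous response Y.
X = [(0.0,), (1.0,), (3.0,), (7.0,)]
y = [10.0, 12.0, 20.0, 40.0]

cols = loo_knn_predictions(X, y, max_k=2)
# Name the columns in the style "Predicted <Response> <k>".
named = {f"Predicted Y {k}": values for k, values in cols.items()}
```

Here named holds two columns, Predicted Y 1 and Predicted Y 2, each with one prediction per training row.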

Save Prediction Formula

Saves a column that contains a prediction formula for a specific k nearest neighbor model. Enter a value for K when prompted. The prediction formula contains all the training data, so this option might not be practical for large data tables. This option is useful for scoring new observations or predicting missing response values.

Caution: For training data rows, the predicted values from Save Prediction Formula and Save Predicteds do not match. Save Prediction Formula uses all of the rows in the training set, including the row that is being predicted. Save Predicteds excludes the row that is being predicted and uses only the other training rows. Because the prediction formula uses all of the training data, each training row is its own first neighbor and is predicted perfectly for k = 1. As a result, prediction formula values for training rows can be too accurate, and model statistics calculated from them give inflated estimates of model accuracy. For this reason, the report statistics that appear in the platform are not generated using the predictions from the prediction formula.
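
The difference described in this caution can be reproduced on a small example. The Python sketch below is illustrative only and is not JMP code: it assumes Euclidean distance, a continuous response predicted as the mean of the k nearest responses, and training rows with distinct predictor values. The names knn_predict, with_self, and held_out are hypothetical.

```python
import math

def knn_predict(X, y, query, k, exclude=None):
    """Mean response of the k training rows nearest to `query`.

    If `exclude` is a row index, that row is left out of the search.
    """
    ranked = sorted(
        (math.dist(query, X[j]), j) for j in range(len(X)) if j != exclude
    )
    return sum(y[j] for _, j in ranked[:k]) / k

def rase(actual, predicted):
    """Root average squared error (assumed definition)."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)

# Hypothetical training data with distinct predictor values.
X = [(0.0,), (1.0,), (3.0,), (7.0,)]
y = [10.0, 12.0, 20.0, 40.0]

# Prediction-formula style: all training rows are used, including the row itself.
with_self = [knn_predict(X, y, X[i], k=1) for i in range(len(X))]

# Save Predicteds style: the row being predicted is held out.
held_out = [knn_predict(X, y, X[i], k=1, exclude=i) for i in range(len(X))]

rase_with_self = rase(y, with_self)
rase_held_out = rase(y, held_out)
```

For k = 1, with_self reproduces the responses exactly, so its RASE is zero, whereas held_out gives a nonzero RASE. This is why statistics computed from prediction formula values on training rows look better than the model warrants.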

Publish Prediction Formula

Creates a prediction formula for the specified k nearest neighbor model and saves it as a formula column script in the Formula Depot platform. If a Formula Depot report is not open, this option creates a Formula Depot report. See Formula Depot.
