Test Set Average AUC Range

The Test Set Average AUC Range tab is shown below:

The Test Set Average AUC Range tab contains the following elements:

• Plot of Learning Curves

The plot shows the average area under the curve (AUC) learning curve for each model as a solid line, with a shaded region around it indicating the range of the individual curves. The width of the shading provides a measure of variability for the average learning curve. Curves with narrow bands are more reliable than those with wide bands.
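As a rough illustration of how such a band could be derived, the sketch below summarizes a set of individual learning curves into an average line and a range. It assumes the range is the pointwise minimum and maximum across the individual curves, which the text above does not state explicitly. The function name summarize_curves and the sample values are hypothetical.

```python
import numpy as np

def summarize_curves(curves):
    """Summarize individual AUC learning curves.

    curves: array-like of shape (n_curves, n_sample_sizes); each row is
    one individual learning curve evaluated at the same sample sizes.
    Returns the average curve, the lower and upper edges of the band,
    and the band width at each sample size.
    """
    curves = np.asarray(curves, dtype=float)
    mean = curves.mean(axis=0)    # solid line
    lower = curves.min(axis=0)    # bottom of shaded region (assumed: minimum)
    upper = curves.max(axis=0)    # top of shaded region (assumed: maximum)
    return mean, lower, upper, upper - lower

# Hypothetical AUC values: two individual curves at three sample sizes.
mean, lower, upper, width = summarize_curves([[0.6, 0.7, 0.8],
                                              [0.7, 0.8, 0.8]])
```

A band width that shrinks toward zero at a given sample size means the individual curves agree there, which is what makes the average curve more trustworthy at that point.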

Interpreting the Learning Curves

Learning curves are constructed by taking a succession of different-sized subsets of the full data and assessing cross-validation performance on each. Sample size is plotted on the x-axis, and the cross-validation performance metric is plotted on the y-axis. The primary goal of this process is to determine whether adding more samples will change performance. This is achieved by inspecting the slope of the curves, especially toward the right-hand side. If the curves have a slope similar to that shown in this example, it is likely that adding more samples will improve performance. If the slopes are flat, adding more samples will likely not make much of a difference.
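The visual slope check described above can also be approximated numerically. The following is a minimal sketch, not part of the software: it fits a straight line to the last few points of an average learning curve and reports the slope. The function names, the number of tail points, and the tolerance tol are illustrative assumptions, and the AUC values are made up.

```python
import numpy as np

def tail_slope(sample_sizes, mean_auc, n_tail=3):
    """Least-squares slope over the last n_tail points of a learning curve.

    The slope is in units of AUC per additional sample.
    """
    x = np.asarray(sample_sizes, dtype=float)[-n_tail:]
    y = np.asarray(mean_auc, dtype=float)[-n_tail:]
    slope, _intercept = np.polyfit(x, y, 1)
    return float(slope)

def more_samples_likely_help(sample_sizes, mean_auc, n_tail=3, tol=1e-4):
    """Return True if the right-hand side of the curve is still rising.

    tol is an arbitrary threshold chosen for illustration only.
    """
    return tail_slope(sample_sizes, mean_auc, n_tail) > tol

sizes = [50, 100, 150, 200]
rising = [0.60, 0.70, 0.76, 0.80]   # hypothetical curve that is still climbing
flat = [0.70, 0.80, 0.80, 0.80]     # hypothetical curve that has leveled off

print(more_samples_likely_help(sizes, rising))
print(more_samples_likely_help(sizes, flat))
```

Any such threshold should be read together with the width of the shaded band: a small positive slope inside a wide band is weak evidence either way.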

The AUC is the area under the Receiver Operating Characteristics (ROC) curve, which plots the true-positive rate against the false-positive rate for a binary response variable. The greater the AUC, the better the model is at distinguishing positive responses from negative ones. See Receiver Operating Characteristics (ROC) Curves for additional information.
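To make the definition concrete, here is a small sketch that computes the AUC from scratch: it sweeps a threshold over the predicted scores, records the true-positive and false-positive rates at each step, and integrates with the trapezoid rule. This is a generic textbook calculation and is not necessarily how the software computes the statistic. The function name roc_auc and the example data are hypothetical.

```python
import numpy as np

def roc_auc(y_true, scores):
    """Area under the ROC curve for binary labels (1 = positive, 0 = negative)."""
    y = np.asarray(y_true, dtype=int)
    s = np.asarray(scores, dtype=float)
    if y.sum() == 0 or y.sum() == y.size:
        raise ValueError("AUC needs at least one positive and one negative label")

    # Sort by decreasing score so that each prefix is "predicted positive".
    order = np.argsort(-s, kind="mergesort")
    y, s = y[order], s[order]

    # Keep one ROC point per distinct score so that ties are handled together.
    cut = np.r_[np.where(np.diff(s) != 0)[0], y.size - 1]
    tp = np.cumsum(y)[cut]        # true positives at each threshold
    fp = np.cumsum(1 - y)[cut]    # false positives at each threshold

    tpr = np.r_[0.0, tp / tp[-1]]  # true-positive rate, starting at the origin
    fpr = np.r_[0.0, fp / fp[-1]]  # false-positive rate, starting at the origin

    # Trapezoid rule over the ROC points.
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))

auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])  # 0.75
```

A model whose scores carry no information gives an AUC near 0.5, and a model that ranks every positive above every negative gives 1.0.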

Note: Using learning curves to judge whether more samples will help is completely different from classical power and sample size calculations, which are performed in the context of statistical hypothesis testing.

Refer to AUC for more information about this statistic.