The Prediction Profiler red triangle menu contains the following options:
Optimization and Desirability
Submenu that consists of the following options:
Desirability Functions
Shows or hides the desirability functions. Desirability is discussed in Desirability Profiling and Optimization.
Maximize Desirability
Sets the current factor values to maximize the desirability functions. Takes into account the response importance weights.
Note: In many situations, the settings that optimize the desirability function are not unique. The Maximize Desirability option gives one such setting. The Contour Profiler is a good tool for finding alternative factor combinations that optimize desirability. For an example, see Additional Example of the Contour Profiler.
Note: If a factor has a Design Role column property value of Discrete Numeric, it is treated as continuous in the optimization of the desirability function. To account for the fact that the factor can assume only discrete levels, it is displayed in the profiler as a categorical term and an optimal allowable level is selected.
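As a rough illustration of how response importance weights can enter the optimization, the following Python sketch combines individual desirabilities into an importance-weighted geometric mean and maximizes it over a grid. Everything here is an assumption made for illustration: the piecewise-linear desirability shape, the two response models (`yield_pct`, `purity_pct`), and the grid search itself. It is not JMP's desirability function or optimizer.

```python
import math

def larger_is_better(y, low, high):
    """Piecewise-linear desirability: 0 at or below low, 1 at or above high."""
    return min(1.0, max(0.0, (y - low) / (high - low)))

def overall_desirability(d_values, importances):
    """Importance-weighted geometric mean of the individual desirabilities."""
    if any(d == 0.0 for d in d_values):
        return 0.0
    total = sum(importances)
    log_sum = sum(w * math.log(d) for d, w in zip(d_values, importances))
    return math.exp(log_sum / total)

# Hypothetical responses of a single factor x in [0, 1].
def yield_pct(x):
    return 60.0 + 40.0 * x                 # keeps increasing with x

def purity_pct(x):
    return 99.0 - 30.0 * (x - 0.4) ** 2    # peaks at x = 0.4

def desirability_at(x, importances=(1.0, 1.0)):
    d = [larger_is_better(yield_pct(x), 60.0, 100.0),
         larger_is_better(purity_pct(x), 90.0, 99.0)]
    return overall_desirability(d, importances)

grid = [i / 100 for i in range(101)]
best_equal = max(grid, key=desirability_at)
best_purity_heavy = max(grid, key=lambda x: desirability_at(x, (1.0, 5.0)))
print(best_equal, best_purity_heavy)
```

Giving purity a larger importance pulls the chosen setting toward the purity peak, which is the kind of trade-off the importance weights control.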
Maximize and Remember
Maximizes the desirability functions and remembers the associated settings.
Maximization Options
Opens the Maximization Options window where you can refine the optimization settings. See Maximization Options Window.
Maximize for Each Grid Point
Used only if one or more factors are locked. The ranges of the locked factors are divided into a grid, and the desirability is maximized at each grid point. This is useful if the model that you are profiling has categorical factors. Then the optimal condition can be found for each combination of the categorical factors.
Save Desirabilities
Saves the three desirability function settings for each response, and the associated desirability values, as a Response Limits column property in the data table. These correspond to the coordinates of the handles in the desirability plots.
Set Desirabilities
Opens the Response Goal window where you can set specific desirability values.
Figure 3.5 Response Goal Window
Save Desirability Formula
Creates a column in the data table with a formula for Desirability. The formula uses the fitting formula when it can, or the response variables when it cannot access the fitting formula.
Assess Variable Importance
Provides different approaches to calculating indices that measure the importance of factors to the model. These indices are independent of the model type and fitting method. See Assess Variable Importance.
Save Bagged Predictions
(Available only when the Prediction Profiler is embedded in select modeling platforms.) Launches the Bagging window. Bootstrap aggregating (bagging) enables you to create multiple training data sets by sampling with replacement from the original data. For each training set, a model is fit using the analysis platform, and predictions are made. The final prediction is a combination of the results from all of the models. This improves prediction performance by reducing the error from variance. See Bagging.
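The bagging idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the model is a simple straight-line fit, the predictions are combined by averaging, and the data and the names `fit_line` and `bagged_predict` are hypothetical. It does not reproduce the Bagging window or any JMP platform.

```python
import random
import statistics

def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b*x; returns (a, b)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def bagged_predict(xs, ys, x_new, n_boot=50, seed=1):
    """Average the predictions of models fit to bootstrap resamples."""
    rng = random.Random(seed)
    n = len(xs)
    preds = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]   # sample rows with replacement
        bx = [xs[i] for i in idx]
        by = [ys[i] for i in idx]
        if len(set(bx)) < 2:                         # degenerate resample: no slope
            continue
        a, b = fit_line(bx, by)
        preds.append(a + b * x_new)
    return statistics.fmean(preds), statistics.stdev(preds)

# Made-up data: y = 2 + 3x with a small alternating disturbance.
xs = [float(i) for i in range(10)]
ys = [2.0 + 3.0 * x + (0.5 if i % 2 == 0 else -0.5) for i, x in enumerate(xs)]
bagged_mean, bagged_sd = bagged_predict(xs, ys, x_new=4.5)
print(bagged_mean, bagged_sd)
```

The spread of the individual bootstrap predictions (`bagged_sd`) gives a rough sense of how much the prediction varies from one resample to the next.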
Simulator
Launches the Simulator. The Simulator enables you to create Monte Carlo simulations using random noise added to factors and predictions for the model. A typical use is to set fixed factors at their optimal settings, and uncontrolled factors and model noise to random values. You then find out the rate of responses outside the specification limits. See Simulator.
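The typical use described above can be sketched as a small Monte Carlo loop in Python. The prediction formula, the factor names, the noise distributions, and the specification limits below are all hypothetical; the sketch only shows the pattern of fixing a controlled factor, randomizing an uncontrolled factor and the model noise, and counting out-of-spec responses.

```python
import random

def predicted_response(temp, humidity):
    """Hypothetical prediction formula standing in for a fitted model."""
    return 50.0 + 0.8 * (temp - 100.0) - 0.5 * (humidity - 40.0)

def defect_rate(n_runs=10000, lsl=45.0, usl=55.0, seed=7):
    """Estimate the fraction of simulated responses outside [lsl, usl]."""
    rng = random.Random(seed)
    out_of_spec = 0
    for _ in range(n_runs):
        temp = 100.0                          # controlled factor, held fixed
        humidity = rng.gauss(40.0, 4.0)       # uncontrolled factor, random
        noise = rng.gauss(0.0, 1.0)           # model noise added to the prediction
        y = predicted_response(temp, humidity) + noise
        if y < lsl or y > usl:
            out_of_spec += 1
    return out_of_spec / n_runs

rate = defect_rate()
print(rate)
```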
Design Space Profiler
(Available only if there is at least one continuous factor.) Shows or hides the Design Space Profiler. Use the Design Space Profiler to determine operation limits for the factors that honor the specification limits on the response variables. See Design Space Profiler.
Interaction Profiler
Shows or hides interaction plots that update as you update the factor values in the Prediction Profiler. Use this option to visualize third-degree interactions by seeing how the plot changes as current values for the factors change. The cells that change for a given factor are the cells that do not involve that factor directly. If there is more than one response, there is a separate tab in the Interaction Profiler report for each response.
Overlaid Interactions
Shows or hides faded curves in the Prediction Profiler plots. The faded curves represent the profilers for different types of interactions among the ranges of the factors. This option provides a static view of the interactions without needing to change factor values in the profiler. For continuous factors, interaction curves are created for factor values that are evenly spaced over the specified factor range. For categorical factors, interaction curves are created for each level of the factor.
When this option is selected, the bold black line represents the profiler curve calculated using the current factor values. The light gray lines represent the interaction curves that are defined by Combinations and Spanning Range options. If you hover over an interaction curve, the curve is highlighted and the corresponding factor values are shown by gray tick marks on the other factor axes. If you click on a highlighted curve, the factor values are updated to match those that were used to create the highlighted interaction curve.
Overlaid Interactions Options
(Available only when the Overlaid Interactions option is selected.) Shows a submenu of options that define the overlaid interaction curves in the Prediction Profiler plots.
Combinations
Shows a submenu of options that specify the types of interaction curves that are shown in the profiler.
Mixed
Plots interaction curves for all possible combinations of categorical factors and one-at-a-time changing of continuous factors. If there are no categorical factors, this is the same as the Two-way option.
Two-way
Plots interaction curves where each curve corresponds to a single factor changing while all other factors are held constant. This option is useful for seeing two-way interactions between factors. This is the default option.
Many-way
Plots interaction curves for all possible combinations of other factors.
Spanning Range
Shows a submenu of options that specify how the sampling range of each continuous factor is determined. The sampling range for each factor defines the lowest and highest values for which the interaction curves are created. The interaction curves are then created for factor values that are equally spaced over the factor range so that the total number of interaction curves matches the value defined by the Samples per Factor option.
Inner axis range
Defines the range of each factor as the inner 80% of the factor’s axis range. By default, this is the 10% quantile and the 90% quantile of each factor’s range. The range is updated if the factor axis is updated (through zooming or panning).
Full axis range
Defines the range of each factor using the lowest and highest values on the axis of the factor. By default, this is the lowest and highest value of each factor. The range is updated if the factor axis is updated (through zooming or panning).
One Standard Deviation
Defines the range of each factor as the mean of the factor plus or minus one standard deviation.
Two Standard Deviations
Defines the range of each factor as the mean of the factor plus or minus two standard deviations.
Data Range
Defines the range of each factor using the lowest and highest factor values in the data.
Samples per Factor
Specifies the number of sample values that are taken for each continuous factor. These sample values are used to create the interaction curves on the profiler graphs. The default value is 6. If the Many-way option is selected, the number of sample values taken for each continuous factor is floor(Samples per Factor*2/3).
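The Spanning Range and Samples per Factor arithmetic can be sketched in Python as follows. The option labels passed as strings, the use of the sample standard deviation, and the assumption that the equally spaced values include both endpoints are illustrative choices, not confirmed details of the profiler.

```python
import math
import statistics

def spanning_range(option, axis_min, axis_max, data):
    """Return (low, high) for one continuous factor under a given option."""
    if option == "inner":                     # inner 80% of the axis range
        width = axis_max - axis_min
        return axis_min + 0.1 * width, axis_max - 0.1 * width
    if option == "full":                      # full axis range
        return axis_min, axis_max
    if option in ("one_sd", "two_sd"):        # mean plus or minus k standard deviations
        k = 1 if option == "one_sd" else 2
        m, s = statistics.fmean(data), statistics.stdev(data)
        return m - k * s, m + k * s
    if option == "data":                      # observed data range
        return min(data), max(data)
    raise ValueError(f"unknown option: {option}")

def sample_values(low, high, samples_per_factor=6, many_way=False):
    """Equally spaced sample values; Many-way uses floor(samples * 2 / 3)."""
    n = math.floor(samples_per_factor * 2 / 3) if many_way else samples_per_factor
    if n < 2:
        return [(low + high) / 2]
    step = (high - low) / (n - 1)
    return [low + i * step for i in range(n)]

low, high = spanning_range("inner", 0.0, 10.0, data=[2.0, 4.0, 6.0, 8.0])
print(low, high)
print(sample_values(0.0, 10.0))
print(sample_values(0.0, 10.0, many_way=True))
```

With the default of 6 samples, the Many-way option uses floor(6*2/3) = 4 values per continuous factor, which keeps the number of factor combinations manageable.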
Data Points
Shows or hides the individual data points in the Prediction Profiler plot. The data points appear faded based on their distance from the plane of each profiler. Data points that are farther away from the plane of the profiler appear lighter than data points that are closer to the plane of the profiler. Use this option to provide a visual diagnostic that determines how much support a profiler curve has in the data.
Tip: To select a point in the plot, use the brush tool in the toolbar.
Confidence Intervals
Shows or hides confidence intervals in the Prediction Profiler plot. The intervals are error bars for categorical factors and curves for continuous factors. Use the intervals for each predictor to assess the impact of that predictor on the confidence in the prediction of the response. The confidence interval for the response is displayed on the vertical axis in blue. Confidence intervals are available when the profiler is used inside certain fitting platforms, when confidence interval formulas are saved to the data table, or when a standard error column of the form PredSE<colname> has been specified in the Y, Prediction Formula role of the Prediction Profiler launch window.
Prediction Intervals
Shows or hides prediction intervals in the Prediction Profiler plot. Prediction intervals are calculated by including both the variation in estimating the model and the variation in the residual error. The intervals are error bars for categorical factors and curves for continuous factors. Use the intervals for each predictor to assess the impact of that predictor on the confidence in the prediction of a new observation. The prediction interval for a new observation is displayed on the vertical axis in green. Prediction intervals are available when the profiler is used inside certain fitting platforms or when there are prediction interval formulas saved to the data table.
Note: Prediction intervals are wider than confidence intervals. Prediction intervals are for a new observation not used in the construction of the model.
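To see why prediction intervals are wider, the following Python sketch computes both half-widths for a simple straight-line least squares fit. The data are made up, and the critical value 2.306 is the 97.5% Student's t quantile for 8 degrees of freedom, supplied by hand. This is the textbook formula for ordinary least squares; other fitting platforms may compute their intervals differently.

```python
import math
import statistics

def interval_half_widths(xs, ys, x0, crit):
    """Return (confidence, prediction) interval half-widths at x0."""
    n = len(xs)
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(sse / (n - 2))              # residual standard deviation
    lev = 1.0 / n + (x0 - mx) ** 2 / sxx      # variation from estimating the model
    ci = crit * s * math.sqrt(lev)            # mean response only
    pi = crit * s * math.sqrt(1.0 + lev)      # adds the residual error term
    return ci, pi

xs = [float(i) for i in range(1, 11)]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1, 18.0, 19.9]
ci_mid, pi_mid = interval_half_widths(xs, ys, x0=5.5, crit=2.306)
ci_edge, pi_edge = interval_half_widths(xs, ys, x0=10.0, crit=2.306)
print(ci_mid, pi_mid, ci_edge, pi_edge)
```

The extra `1.0 +` under the square root is the residual error contribution, so the prediction half-width always exceeds the confidence half-width at the same factor setting.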
Prop of Error Bars
(Appears when a Sigma column property exists in any of the factor and response variables.) Shows or hides the 3σ interval that is implied on the response due to the variation in the factor. The interval values are also displayed on the vertical axis in green. Propagation of error (POE) is important for attributing variation in the response to variation in the factor values when the factor values are not very controllable. See Statistical Details for Propagation of Error Bars.
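A common first-order way to propagate factor variation to the response is sketched below in Python. It assumes independent factors, numerical central-difference derivatives, and an interval of plus or minus three propagated standard deviations; the model `f` and all sigma values are hypothetical. See Statistical Details for Propagation of Error Bars for the formula that the profiler actually uses.

```python
import math

def partial(f, point, name, h=1e-5):
    """Central-difference estimate of the partial derivative of f with respect to one factor."""
    up, down = dict(point), dict(point)
    up[name] += h
    down[name] -= h
    return (f(**up) - f(**down)) / (2 * h)

def poe_sigma(f, point, factor_sigmas, response_sigma=0.0):
    """First-order propagated standard deviation of the response."""
    variance = response_sigma ** 2
    for name, sigma in factor_sigmas.items():
        variance += (partial(f, point, name) * sigma) ** 2
    return math.sqrt(variance)

def f(x1, x2):
    """Hypothetical prediction formula."""
    return 3.0 * x1 + x2 ** 2

point = {"x1": 1.0, "x2": 2.0}
sigma_y = poe_sigma(f, point, {"x1": 0.1, "x2": 0.05})
y_hat = f(**point)
interval = (y_hat - 3 * sigma_y, y_hat + 3 * sigma_y)
print(sigma_y, interval)
```

Here the partial derivatives are 3 and 4, so the propagated variance is (3 × 0.1)² + (4 × 0.05)² = 0.13.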
Sensitivity Indicator
Shows or hides a purple triangle whose height and direction correspond to the value of the partial derivative of the profile function at its current value. This is useful in large profiles to be able to quickly spot the sensitive cells.
Figure 3.6 Sensitivity Indicators
Profile at Boundary
When analyzing a mixture design, JMP constrains the ranges of the factors so that settings outside the mixture constraints are not possible. This is why, in some mixture designs, the profile traces turn abruptly.
When there are mixture components that have constraints, other than the usual zero-to-one constraint, a new submenu, called Profile at Boundary, appears on the Prediction Profiler red triangle menu. It has the following two options:
Turn At Boundaries
Lets the settings continue along the boundary of the constraint condition.
Stop At Boundaries
Truncates the prediction traces to the region where strict proportionality is maintained.
Extrapolation Control
Shows a submenu of options for extrapolation control. This feature helps identify possible extrapolated predictions. A prediction is considered an extrapolation when it is made using a combination of factor points that are not within the factor space of the original data. In the extrapolation control feature, the metric used to determine whether a point is an extrapolation depends on the type of model fit. For models that are fit in the Standard Least Squares personality of the Fit Model platform, the leverage at the factor settings is used as the default extrapolation metric. For all other models, the regularized Hotelling’s T² value is used as the default extrapolation metric. In all models, there is an option to use the K Nearest Neighbors extrapolation metric. See Statistical Details for Extrapolation Control Metrics.
Extrapolation Control is available in profilers embedded in the following platforms: Fit Least Squares, Neural, Naive Bayes, Partial Least Squares, Support Vector Machines, Structural Equation Models, and Generalized Regression. It is also available in profilers launched from the Graph menu. If Extrapolation Control is not available in a particular platform, you can save the prediction formula to the data table and launch the Profiler from the Graph menu. The data used for the extrapolation control metrics depends on the type of profiler.
– When a model is built with validation, the embedded profiler and extrapolation control metrics are based on the training data.
– If you launch a profiler from the Graph menu, the extrapolation control metrics are based on all data, unless you specifically exclude certain rows.
– When a model is built in a platform that ignores missing values during model fitting, those rows are excluded from the embedded profiler and extrapolation control metrics.
– When a model is built with Informative Missing, the embedded profiler and extrapolation control metrics reflect the informative missing.
– To include informative missing in the extrapolation control metrics when launching the profiler from the Graph menu, use the Informative Missing column property.
– If you call Extrapolation Control from a profiler launched from the Graph menu, the regularized Hotelling’s T² value is used as the default extrapolation metric, regardless of the type of model fit. Therefore, the extrapolation control results from a profiler embedded in the Standard Least Squares platform might not match those from the Graph menu profiler.
The extrapolation control red triangle menu includes options to either warn of possible extrapolation or to restrict the factor settings so that extrapolated predictions are not shown.
Off
Turns off all extrapolation controls and warnings.
On
Turns on extrapolation control. When this option is selected, it is indicated at the top of the profiler and the profile traces are restricted to factor combinations that do not lead to extrapolations.
Warning On
Turns on extrapolation warnings. When this option is selected, it is indicated at the top of the profiler. If the selected factor combination produces an extrapolation, an alert appears that reads --Possible Extrapolation--.
Extrapolation Type Option
Provides a submenu that enables you to select which extrapolation metric is used. For models that are fit with the Standard Least Squares personality of the Fit Model platform, the options are Leverage and K Nearest Neighbors. For all other models, the options are Regularized T² and K Nearest Neighbors. See Statistical Details for Extrapolation Control Metrics.
Extrapolation Details
Shows or hides the extrapolation control details above the prediction profiler. The extrapolation control details include the value of the extrapolation metric at the current point, the value of the extrapolation threshold, the type of extrapolation metric, and the definition of the extrapolation threshold.
Set Threshold Criterion
Opens a window that enables you to adjust the extrapolation threshold. When the extrapolation metric is the leverage at the factor settings, you can specify how the leverage is computed and the value of the corresponding multiplier. When the extrapolation metric is the regularized Hotelling’s T² value, you can specify the multiplier. See Statistical Details for Extrapolation Control Metrics.
Note: This option is not applicable for the K Nearest Neighbors extrapolation metric.
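The idea behind a Hotelling's T² extrapolation check can be sketched for two continuous factors in Python. This sketch uses the plain (unregularized) T² with the sample covariance matrix, and it sets the threshold to the mean training T² plus a multiplier times its standard deviation. Both choices, the multiplier of 3, and the data are assumptions for illustration. See Statistical Details for Extrapolation Control Metrics for the definitions that the profiler uses.

```python
import statistics

def hotelling_t2(point, rows):
    """Plain Hotelling's T^2 distance of a 2-factor point from the data."""
    x1 = [r[0] for r in rows]
    x2 = [r[1] for r in rows]
    n = len(rows)
    m1, m2 = statistics.fmean(x1), statistics.fmean(x2)
    s11 = sum((a - m1) ** 2 for a in x1) / (n - 1)
    s22 = sum((b - m2) ** 2 for b in x2) / (n - 1)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2)) / (n - 1)
    det = s11 * s22 - s12 ** 2
    d1, d2 = point[0] - m1, point[1] - m2
    # d' S^{-1} d using the closed-form inverse of a 2x2 covariance matrix
    return (s22 * d1 * d1 - 2 * s12 * d1 * d2 + s11 * d2 * d2) / det

def threshold(rows, multiplier=3.0):
    """Hypothetical threshold: mean + multiplier * sd of the training T^2 values."""
    t2_values = [hotelling_t2(r, rows) for r in rows]
    return statistics.fmean(t2_values) + multiplier * statistics.stdev(t2_values)

# Made-up training data with strongly correlated factors.
rows = [(1, 1.2), (2, 1.9), (3, 3.1), (4, 3.8), (5, 5.2),
        (6, 5.9), (7, 7.1), (8, 7.8), (9, 9.2)]
limit = threshold(rows)
inside = hotelling_t2((5.0, 5.0), rows)
outside = hotelling_t2((2.0, 8.0), rows)
print(limit, inside, outside)
```

The point (2, 8) lies within the range of each factor taken separately but far from the correlation pattern of the data, which is the kind of extrapolation a multivariate metric can flag and single-factor ranges cannot.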
Reset Factor Grid
Displays a window for each factor enabling you to enter a specific value for the factor’s current setting, to lock that setting, and to control aspects of the grid. See the section Set or Lock Factor Values.
Figure 3.7 Factor Settings Window
Factor Settings
Submenu that consists of the following options:
Remember Settings
Adds an outline node to the report that accumulates the values of the current settings each time the Remember Settings command is invoked. Each remembered setting is preceded by a radio button that is used to reset to those settings. There are options to remove selected settings or all settings in the Remembered Settings red triangle menu. The names of the remembered settings are also customizable.
Note: If you launch the profiler from a model fitting platform that performs variable selection, this option assigns missing values to factors that are not included in the model.
Set To Data in Row
Assigns the values of a data table row to the X variables in the Prediction Profiler.
Copy Settings Script
Copies the current Prediction Profiler’s settings to the clipboard.
Paste Settings Script
Pastes the Prediction Profiler settings from the clipboard to a Prediction Profiler in another report.
Append Settings to Table
Appends the current profiler’s settings to the end of the data table. This is useful if you have a combination of settings in the Prediction Profiler that you want to add to an experiment in order to do another run.
Note: If you launch the profiler from a model fitting platform that performs variable selection, this option assigns missing values to factors that are not included in the model.
Broadcast Factor Settings
Sends the current profiler’s factor settings to all other profilers, but does not link the profilers. A change in a factor in one profiler does not cause changes in any other profilers unless Broadcast Factor Settings is selected again.
Link Profilers
Links all the profilers together. A change in a factor in one profiler causes that factor to change to that value in all other profilers, including Surface Plot. This is a global option that is set or unset for all profilers.
Set Script
Sets a script that is called each time a factor changes. The set script receives a list of arguments of the form:
{factor1 = n1, factor2 = n2, ...}
For example, to write this list to the log, first define a function:
ProfileCallbackLog = Function({arg},show(arg));
Then enter ProfileCallbackLog in the Set Script dialog.
Similar functions convert the factor values to global values:
ProfileCallbackAssign = Function({arg},evalList(arg));
Or access the values one at a time:
ProfileCallbackAccess = Function({arg},f1=arg["factor1"];f2=arg["factor2"]);
Unthreaded
Enables you to change to an unthreaded analysis if multithreading does not work.
Animation
Shows or hides animation controls that enable you to easily cycle through a variety of factor settings. See Animation Controls.
Default N Levels
Enables you to set the default number of levels for each continuous factor. This option is useful when the Prediction Profiler is especially large. When calculating the traces for the first time, JMP measures how long it takes. If this time is greater than three seconds, you are alerted that decreasing the Default N Levels speeds up the calculations. The maximum value for Default N Levels is 1000.
Output Grid Table
Produces a new data table with columns for the factors that contain grid values, columns for each of the responses with computed values at each grid point, and the desirability computation at each grid point. If any of the factors or responses have specification limits, there are columns that indicate whether the row is within the specification limits. If you launch the Prediction Profiler from a platform that supports confidence intervals and prediction intervals, these intervals are also included as columns in the data table. The new data table contains scripts that can be used to visualize the in-spec regions of the factors or responses.
If you have a large number of factors, a memory allocation message might be displayed for the grid table. In such cases, you can lock some of the factors so that they are held at their locked, constant values in the grid table. To lock a factor, Alt-click (or Option-click) inside the profiler graph to open a window that contains a Lock Factor Setting check box.
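The following Python sketch shows why the grid table grows quickly and how locking a factor shrinks it. The model, factor names, ranges, specification limits, and column labels (`Y`, `In Spec`) are hypothetical, and the sketch builds a plain list of rows rather than a JMP data table.

```python
import itertools

def grid_levels(low, high, n_levels):
    """Return n_levels equally spaced values from low to high inclusive."""
    step = (high - low) / (n_levels - 1)
    return [low + i * step for i in range(n_levels)]

def grid_table(model, ranges, n_levels, locked=None, spec_limits=None):
    """Evaluate the model at every grid point; locked factors keep one value."""
    locked = locked or {}
    names = list(ranges)
    axes = [[locked[name]] if name in locked
            else grid_levels(ranges[name][0], ranges[name][1], n_levels)
            for name in names]
    rows = []
    for combo in itertools.product(*axes):
        row = dict(zip(names, combo))
        y = model(**row)
        row["Y"] = y
        if spec_limits is not None:
            lsl, usl = spec_limits
            row["In Spec"] = lsl <= y <= usl   # within-specification indicator
        rows.append(row)
    return rows

def model(a, b, c):
    """Hypothetical prediction formula."""
    return 10.0 + 2.0 * a - b + 0.5 * a * c

ranges = {"a": (0.0, 1.0), "b": (0.0, 4.0), "c": (-1.0, 1.0)}
full = grid_table(model, ranges, n_levels=5, spec_limits=(8.0, 11.0))
reduced = grid_table(model, ranges, n_levels=5, locked={"c": 0.0},
                     spec_limits=(8.0, 11.0))
print(len(full), len(reduced))
```

With three factors at five levels each, the full grid has 5³ = 125 rows; locking one factor reduces it to 5² = 25 rows.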
Output Random Table
Creates a data table of random factor settings and predicted values over those settings. If any of the factors or responses have specification limits, there are columns that indicate whether the row is within the specification limits. If you launch the Prediction Profiler from a platform that supports confidence intervals and prediction intervals, these intervals are also included as columns in the data table. The new data table contains scripts that can be used to visualize the in-spec regions of the factors or responses.
When you select the Output Random Table option, you are prompted to specify the number of runs. There is also an option to add random noise to one or more of the responses using the specified Std Dev values. This option adds a normal random value with mean zero and specified standard deviation to the predicted response. If the response column contains a Predicting column property that includes a Std Dev value, the Std Dev value is automatically populated with the value from the column property.
This option is a simpler equivalent to opening the Simulator, resetting all the factors to a random uniform distribution, then simulating responses (with or without added random noise).
The prime reason to make uniform random factor tables is to explore the factor space in a multivariate way using graphical queries. This technique is called Filtered Monte Carlo.
Suppose you want to see the locus of all factor settings that produce a given range of desirable response settings. By selecting and hiding the points that do not qualify (using graphical brushing or the Data Filter), you see the possibilities of what is left: the opportunity space yielding the result that you want.
Some rows might appear selected and marked with a red dot. These represent the points on the multivariate desirability Pareto Frontier, that is, the points that are not dominated by other points with respect to the desirability of all the factors. The selected rows correspond to rows that have a value of 1 in the Dominant column.
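Filtered Monte Carlo and the dominance test behind a Pareto frontier can be sketched in Python as follows. The response formula, factor ranges, target range, and the example desirability tuples are made up. Each tuple is treated as a set of desirability values in which larger is better, and the 0/1 flags mimic the idea of a Dominant column without reproducing how JMP computes it.

```python
import random

def random_table(n_runs, ranges, seed=3):
    """Uniform random factor settings, one dict per run."""
    rng = random.Random(seed)
    return [{name: rng.uniform(low, high) for name, (low, high) in ranges.items()}
            for _ in range(n_runs)]

def response(x1, x2):
    """Hypothetical prediction formula."""
    return 5.0 + 3.0 * x1 - 2.0 * x2

def dominates(a, b):
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))

def dominant_flags(points):
    """1 for points that no other point dominates, 0 otherwise."""
    return [0 if any(dominates(q, p) for q in points) else 1 for p in points]

ranges = {"x1": (0.0, 1.0), "x2": (0.0, 1.0)}
table = random_table(500, ranges)
# Filter: keep only the settings whose response falls in the desired range.
kept = [row for row in table if 6.0 <= response(**row) <= 7.0]
flags = dominant_flags([(1.0, 1.0), (2.0, 0.5), (0.5, 0.5), (0.5, 2.0)])
print(len(kept), flags)
```

The rows in `kept` are the opportunity space; in the small dominance example, only (0.5, 0.5) is dominated (by (1.0, 1.0)), so it is the only point flagged 0.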
Shapley Values
A submenu of options to calculate Shapley values. Shapley values explain individual predictions of a model. For each independent variable, x_j, a vector of Shapley values, φ_j, is calculated so that there is a value for each individual prediction. These values give the contribution of the independent variable to a prediction compared to the average prediction of the model fit on the background data set. Shapley values are additive and each prediction can be written as a sum of the Shapley values plus the average prediction. The average prediction is referred to as the Shapley Intercept. For more information about Shapley values, see Shapley (1953) and Lundberg and Lee (2017).
The Profiler uses the Permutation SHAP method to calculate the Shapley values. See Lundberg (2018).
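The additivity property can be checked with a small permutation-sampling sketch in Python. This is a plain permutation estimator written for illustration: the model `f`, the background rows, and the function name are hypothetical, and refinements found in published Permutation SHAP implementations (such as pairing each permutation with its reverse) are omitted. It is not the profiler's implementation.

```python
import random
import statistics

def permutation_shapley(f, x, background, n_perm=10, seed=0):
    """Estimate Shapley values for one row x against a background data set."""
    rng = random.Random(seed)
    p = len(x)
    phi = [0.0] * p
    count = 0
    for _ in range(n_perm):
        order = list(range(p))
        rng.shuffle(order)                    # random order of the variables
        for z in background:
            current = list(z)                 # start from a background row
            prev = f(current)
            for j in order:
                current[j] = x[j]             # switch variable j to its value in x
                value = f(current)
                phi[j] += value - prev        # marginal contribution of variable j
                prev = value
            count += 1
    phi = [v / count for v in phi]
    intercept = statistics.fmean(f(list(z)) for z in background)
    return phi, intercept

def f(v):
    """Hypothetical prediction formula with an interaction term."""
    return 2.0 * v[0] + v[1] * v[2]

x = [1.0, 2.0, 3.0]
background = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0], [2.0, 0.0, 1.0]]
phi, intercept = permutation_shapley(f, x, background)
print(phi, intercept, intercept + sum(phi), f(x))
```

Within each pass the marginal contributions telescope to f(x) minus the prediction at the background row, so the intercept plus the sum of the Shapley values equals the prediction for x, regardless of how many permutations are used.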
Save Shapley Values
Adds a new column to the original data table for each independent variable in the predictive model. Each new column contains the Shapley values for that factor, calculated using the current estimation option settings. There is also a hidden column for the Shapley Intercept. By default, Shapley values are not calculated for rows that are excluded in the data table.
Set Shapley Values Options
Opens a window that enables you to specify options for the calculation of the Shapley values.
Background Data Choice
Specifies how much background data is used in the calculation of the Shapley values. You can specify a percentage of the training data or a specific number of rows in the training data. The Shapley values calculation uses all of the training data by default.
Shapley Estimation Method Options
Provides options to specify the number of permutations used in Permutation SHAP and to set a random seed for reproducibility. By default, the number of permutations is 10.
There is an option to calculate the Shapley values for all rows, including excluded rows. There is also an option to add a script to the data table that produces graphs of the Shapley values. A script is added for each response variable.
Click OK to save the option settings. Click OK and Run to save the option settings and calculate the Shapley values.
Alter Linear Constraints
Enables you to add, change, or delete linear constraints. The constraints are incorporated into the operation of the profiler. You can also view and change the bounds for each continuous factor. See Linear Constraints.
Save Linear Constraints
Enables you to save existing linear constraints to a table script called Constraint. See Linear Constraints.
Conditional Predictions
Appears when random effects are included in the model. The random effects predictions are used in formulating the predicted value and profiles.
Appearance
Submenu that consists of the following options:
Arrange in Rows
Specifies the number of plots that appear in a row. This option helps you view plots vertically rather than in one wide row.
Note: To set a default number of plots to appear in a row, go to File > Preferences > Platforms > Profiler and edit the Arrange in Rows preference.
Graph Spacing
Opens a window that enables you to set the amount of horizontal space between graph panels.
Reorder X Variables
Opens a window where you can reorder the model main effects by dragging them to the desired order.
Reorder Y Variables
Opens a window where you can reorder the responses by dragging them to the desired order.
Hide Y Variables
(Available only for continuous responses.) Specifies the response variables that you would like to show or hide in the profiler.
Adapt Y Axis
Re-scales the vertical axis if the response is outside the axis range, so that the range of the response is included.
Show Creator
Shows or hides the name of the platform that created the formula in the response column. The platform name appears on the vertical axis. (Available only if the response column contains a “Creator” named argument in the “Predicting” column property.)
Figure 3.8 Animation Controls
Play/Pause
Press play to animate the profiler. Moves through a cycle of factor settings and loops back to the beginning when the cycle is complete. Press pause to stop the animation.
Cycle Type
Lists the types of cycles for factor settings.
Sequential
Cycles through values for each factor, one factor after another. The name of the factor the animation is currently cycling through is displayed next to the speed slider bar.
Single Factor
Cycles through values for the selected factor while all other factors are held constant. The name of the selected factor is displayed next to the speed slider bar.
Random
Randomly cycles through different combinations of factor settings.
Data Sequential
Sets the factor values to a row in the data table, one row at a time, starting with row 1. The row number of the current factor setting is displayed next to the speed slider bar.
Data Random
Sets the factor values to a random row in the data table, one row at a time. The row number of the current factor setting is displayed next to the speed slider bar.
Speed Slider Bar
Use the slider bar to adjust the speed of the animation.