Ridge
Computes parameter estimates using ridge regression. Ridge regression is a biased estimation technique that applies an l2 penalty. It shrinks the parameter estimates but does not set any of them to zero, so it is useful when you want to retain all predictors in your model. For more details, see Ridge Regression.
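To see the shrink-but-never-zero behavior concretely, here is a minimal sketch using scikit-learn's Ridge (an illustration of the technique, not the implementation described here; the data and penalty value are invented):

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))  # 50 observations, 5 predictors
y = X @ np.array([3.0, 0.0, -2.0, 0.0, 1.0]) + rng.normal(size=50)

# alpha is the weight on the l2 penalty; larger values shrink the
# coefficients more strongly toward zero.
fit = Ridge(alpha=10.0).fit(X, y)
print(fit.coef_)  # all coefficients are shrunken, but none is exactly zero
```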
Lasso
Computes parameter estimates by applying an l1 penalty. The l1 penalty can estimate some coefficients as exactly zero, so variable selection is performed as part of the fitting procedure. In the ordinary Lasso, all coefficients are penalized equally.
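The zeroing effect of the l1 penalty can be seen with the same kind of invented data, again as a scikit-learn sketch rather than the implementation described here:

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
y = X @ np.array([3.0, 0.0, -2.0, 0.0, 1.0]) + rng.normal(size=50)

# alpha is the weight on the l1 penalty; as it grows, more coefficients
# are estimated as exactly zero and those terms drop out of the model.
fit = Lasso(alpha=0.5).fit(X, y)
print(fit.coef_)  # entries equal to 0.0 correspond to dropped terms
```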
Adaptive Lasso
Computes parameter estimates by penalizing a weighted sum of the absolute values of the regression coefficients. The weights in the l1 penalty are determined by the data in such a way as to guarantee the oracle property (Zou, 2006). This option uses the MLEs to weight the l1 penalty. MLEs cannot be computed when the number of predictors exceeds the number of observations or when there are strict linear dependencies among the predictors. If MLEs for the regression parameters cannot be computed, a generalized inverse solution or a ridge solution is used for the l1 penalty weights. See Adaptive Methods.
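The adaptive weighting can be sketched through the standard reduction to an ordinary Lasso: rescale each column by the magnitude of an initial coefficient estimate, fit a Lasso, and scale the result back. The sketch below uses ordinary least squares for the initial estimates and illustrates the general technique, not the platform's algorithm; as noted above, a ridge or generalized-inverse initial fit would be substituted when MLEs cannot be computed.

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression

rng = np.random.default_rng(1)
X = rng.normal(size=(80, 5))
y = X @ np.array([3.0, 0.0, -2.0, 0.0, 1.0]) + rng.normal(size=80)

# Initial estimates define penalty weights w_j = 1 / |beta_init_j|.
beta_init = LinearRegression().fit(X, y).coef_

# Equivalent formulation: scale column j by |beta_init_j|, fit an
# ordinary Lasso, then scale the fitted coefficients back.
scale = np.abs(beta_init)
fit = Lasso(alpha=0.5).fit(X * scale, y)
beta_adaptive = fit.coef_ * scale
print(beta_adaptive)  # weakly supported terms are penalized more heavily
```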
The Lasso and the adaptive Lasso options generally choose parsimonious models when predictors are highly correlated. These techniques tend to select only one of a group of correlated predictors. High-dimensional data tend to have highly correlated predictors. For this type of data, the Elastic Net might be a better choice than the Lasso. For more information, see Lasso Regression.
Elastic Net
Computes parameter estimates by applying both an l1 penalty and an l2 penalty. The l1 penalty ensures that variable selection is performed. The l2 penalty improves predictive ability by shrinking the coefficients, as ridge regression does.
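A minimal scikit-learn sketch of the combined penalty follows; note that scikit-learn's l1_ratio plays the role of the mixing weight, although its parameterization differs in detail from the Elastic Net Alpha described below:

```python
import numpy as np
from sklearn.linear_model import ElasticNet

rng = np.random.default_rng(2)
X = rng.normal(size=(60, 5))
X[:, 1] = X[:, 0] + 0.05 * rng.normal(size=60)  # two highly correlated predictors
y = X @ np.array([2.0, 2.0, 0.0, 0.0, 1.0]) + rng.normal(size=60)

# l1_ratio mixes the penalties: 1.0 is a pure Lasso, 0.0 is pure ridge.
fit = ElasticNet(alpha=0.5, l1_ratio=0.9).fit(X, y)
print(fit.coef_)  # correlated predictors tend to enter together
```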
Adaptive Elastic Net
Computes parameter estimates using an adaptive l1 penalty as well as an l2 penalty. This option uses the MLEs to weight the l1 penalty. MLEs cannot be computed when the number of predictors exceeds the number of observations or when there are strict linear dependencies among the predictors. If MLEs for the regression parameters cannot be computed, a generalized inverse solution or a ridge solution is used for the l1 penalty weights. You can set a value for the Elastic Net Alpha in the Advanced Controls panel. See Adaptive Methods.
The Elastic Net tends to provide better prediction accuracy than the Lasso when predictors are highly correlated. (In fact, both Ridge and the Lasso are special cases of the Elastic Net.) In terms of predictive ability, the adaptive Elastic Net often outperforms both the Elastic Net and the adaptive Lasso. The Elastic Net has the ability to select groups of correlated predictors and to assign appropriate parameter estimates to the predictors involved. For more information, see Elastic Net.
Double Lasso
Computes parameter estimates in two stages. In the first stage, a Lasso model is fit to determine the terms to be used in the second stage. In the second stage, an adaptive Lasso model is fit using only the terms that are included in the first-stage model. The results that are shown are for the second-stage fit. If no variables enter the model in the first stage, there is no second stage, and the results of the first stage appear in the report. See Adaptive Methods.
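The two-stage flow can be sketched as follows, again using scikit-learn and ordinary least squares weights in the second stage as stand-ins for the platform's machinery:

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression

rng = np.random.default_rng(3)
X = rng.normal(size=(80, 8))
y = X @ np.array([3.0, 0.0, -2.0, 0.0, 1.0, 0.0, 0.0, 0.0]) + rng.normal(size=80)

# Stage 1: an ordinary Lasso selects the candidate terms.
stage1 = Lasso(alpha=0.3).fit(X, y)
selected = np.flatnonzero(stage1.coef_)

if selected.size == 0:
    beta = stage1.coef_  # no second stage; report the first-stage fit
else:
    # Stage 2: an adaptive Lasso restricted to the selected terms.
    Xs = X[:, selected]
    scale = np.abs(LinearRegression().fit(Xs, y).coef_)
    fit2 = Lasso(alpha=0.3).fit(Xs * scale, y)
    beta = np.zeros(X.shape[1])
    beta[selected] = fit2.coef_ * scale
print(beta)
```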
The solution paths for the Lasso and Ridge Estimation Methods depend on a single tuning parameter. The solution path for the Elastic Net depends on a tuning parameter for the penalty on the likelihood as well as the Elastic Net Alpha. The penalty on the likelihood for the Elastic Net is a weighted sum of the penalties associated with the Lasso and Ridge Estimation Methods. The Elastic Net Alpha determines the weights of these two penalties. See Statistical Details for Estimation Methods and Statistical Details for Advanced Controls.
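One common way to write the combined penalty, consistent with the weighting described here (the platform's exact parameterization may differ in scaling details):

```latex
% Elastic Net penalty with tuning parameter \lambda and
% mixing weight \alpha (the Elastic Net Alpha):
P_{\lambda,\alpha}(\beta)
  = \lambda \Bigl[ \alpha \sum_j \lvert \beta_j \rvert
  + (1 - \alpha) \sum_j \beta_j^2 \Bigr]
```

Setting α = 1 recovers the Lasso penalty; α = 0 recovers the ridge penalty.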
The grid of tuning parameter values ranges from a lower bound to an upper bound, where the upper bound is defined as the smallest value of the tuning parameter for which all non-intercept terms are zero. The lower bound is zero, except in the following two cases, where it is set to 0.01:
Requires lower-order effects to enter the model before their related higher-order effects. In most cases, this means that X^2 is not in the model unless X is in the model. For estimation methods other than Forward Selection, however, it is possible for X^2 to enter the model and X to leave the model in the same step. If the data table contains a DOE script, this option is enabled, but it is off by default.
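As an illustration of the heredity rule only (the term representation below is hypothetical, not platform code), a model respects the constraint when every higher-order term's component main effects are also present:

```python
def respects_heredity(terms):
    """Check heredity for terms encoded as tuples of factor names:
    ('X',) is a main effect, ('X', 'X') is X^2, ('X', 'Z') is X*Z.
    Every component of a higher-order term must appear as a main effect."""
    mains = {t for t in terms if len(t) == 1}
    higher = [t for t in terms if len(t) > 1]
    return all(all((x,) in mains for x in t) for t in higher)

print(respects_heredity([("X",), ("X", "X")]))  # True: X^2 has X in the model
print(respects_heredity([("Z",), ("X", "X")]))  # False: X^2 present without X
```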
Sets the α parameter for the Elastic Net. This α parameter determines the mix of the l1 and l2 penalty tuning parameters in estimating the Elastic Net coefficients. The default value is α = 0.9, which sets the coefficient on the l1 penalty to 0.9 and the coefficient on the l2 penalty to 0.1. This option is available only when Elastic Net is selected as the Estimation Method. See Statistical Details for Estimation Methods.
Provides options for choosing the scale on which the grid of tuning parameter values is constructed. You can choose among a linear, square root, or log scale. Grid points equal in number to the specified Number of Grid Points are distributed according to the selected scale between the lower and upper bounds of the tuning parameter. See Statistical Details for Advanced Controls.
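A sketch of how the grid points might be distributed under the three scales, assuming a strictly positive lower bound for the log scale (the function name and bounds are illustrative):

```python
import numpy as np

def tuning_grid(lower, upper, n_points, scale="linear"):
    """Distribute n_points tuning-parameter values between the bounds."""
    if scale == "linear":
        return np.linspace(lower, upper, n_points)
    if scale == "square root":
        # equally spaced on the square-root scale, then squared back
        return np.linspace(np.sqrt(lower), np.sqrt(upper), n_points) ** 2
    if scale == "log":
        # requires lower > 0
        return np.exp(np.linspace(np.log(lower), np.log(upper), n_points))
    raise ValueError(f"unknown scale: {scale}")

print(tuning_grid(0.01, 2.0, 5, scale="log"))  # points crowd toward the lower bound
```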
Provides options for choosing the solution in the first stage of the Double Lasso and Two Stage Forward Selection. By default, the solution that is the best fit according to the specified Validation Method is selected and is the solution initially shown (Best Fit). You can choose to initially display models with larger or smaller l1 norm values that lie in the green or yellow zones. For example, if you choose Smallest in Yellow Zone, the initially displayed solution is the model in the yellow zone that has the smallest l1 norm. See Comparable Model Zones.
Provides options for choosing the solution that is initially displayed as the current model in the Solution Path report. The current model is identified by a solid vertical line. See Current Model Indicator. The best fit solution is identified by a dotted vertical line. By default, the displayed solution is the one that is considered the best fit according to the specified Validation Method.
You can choose to initially display models with larger or smaller l1 norm values that still lie in the green or yellow zones. For example, if you choose Smallest in Yellow Zone, the initially displayed solution is the model in the yellow zone that has the smallest l1 norm. See Comparable Model Zones.
KFold
Partitions the data into k folds. For each value of the tuning parameter:
– In turn, each fold is used as a validation set. A model is fit to the observations not in the fold. The log-likelihood based on that model is calculated for the observations in the fold, providing a validation log-likelihood.
– The mean of the validation log-likelihoods for the k folds is calculated. This value serves as the validation log-likelihood for that value of the tuning parameter.
The value of the tuning parameter that has the maximum validation log-likelihood is used to construct the final solution. To obtain the final model, all k models derived for the optimal value of the tuning parameter are fit to the entire data set. Of these, the model that has the highest validation log-likelihood is selected as the final model. The training set used for that final model is designated as the Training set and the holdout fold for that model is the Validation set. These are the Training and Validation sets used in plots and in the reported results for the final solution.
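The fold loop can be sketched as follows for a single grid of tuning-parameter values, using Gaussian log-likelihoods and scikit-learn's Lasso as stand-ins for the platform's models (the final-model refitting step described above is omitted):

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.model_selection import KFold

rng = np.random.default_rng(4)
X = rng.normal(size=(100, 6))
y = X @ np.array([2.0, 0.0, -1.5, 0.0, 1.0, 0.0]) + rng.normal(size=100)

grid = [0.01, 0.05, 0.1, 0.5, 1.0]  # tuning-parameter grid
folds = list(KFold(n_splits=5, shuffle=True, random_state=0).split(X))

def gaussian_loglik(resid):
    # log-likelihood of residuals under a normal model with plug-in variance
    s2 = max(np.mean(resid ** 2), 1e-12)
    return -0.5 * resid.size * (np.log(2 * np.pi * s2) + 1.0)

mean_ll = []
for lam in grid:
    fold_ll = []
    for train, valid in folds:
        fit = Lasso(alpha=lam).fit(X[train], y[train])
        fold_ll.append(gaussian_loglik(y[valid] - fit.predict(X[valid])))
    mean_ll.append(np.mean(fold_ll))  # validation log-likelihood for this lam

best_lam = grid[int(np.argmax(mean_ll))]  # maximizes the validation log-likelihood
print(best_lam)
```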
BIC
Minimizes the Bayesian Information Criterion (BIC) over the solution path. For more details, see Likelihood, AICc, and BIC in Statistical Details.
AICc
Minimizes the corrected Akaike Information Criterion (AICc) over the solution path. AICc is the default Validation Method. For more details, see Likelihood, AICc, and BIC in Statistical Details.
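For reference, the standard definitions of these two criteria, with ℓ the maximized log-likelihood, k the number of estimated parameters, and n the number of observations (the platform's effective-parameter counting for penalized fits may differ):

```latex
\mathrm{AICc} = -2\ell + 2k + \frac{2k(k+1)}{n - k - 1},
\qquad
\mathrm{BIC} = -2\ell + k \ln n
```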
ERIC
Minimizes the Extended Regularized Information Criterion (ERIC) over the solution path. See Model Fit Detail. This method is available only for exponential family distributions and for the Lasso and adaptive Lasso estimation methods.
When you click Go, a report opens. The title of the report specifies the fitting and validation methods that you selected. You can return to the Model Launch control panel to perform additional analyses and choose other estimation and validation methods.