Ridge regression is a form of regularized regression that accommodates numerous, potentially correlated predictors and shrinks their effects using a mixed model with a common variance component. The process computes Best Linear Unbiased Predictions (BLUPs) of the responses based on this mixed model. Computations are performed using SAS/STAT PROC MIXED.
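To make the link between ridge shrinkage and BLUPs concrete, here is a minimal sketch in plain Python. It is not the PROC MIXED computation used by the process. It assumes centered data with no intercept, and it takes the two variance components as known inputs, whereas a mixed-model procedure would estimate them from the data. The function names `solve` and `ridge_blup` and the toy data are hypothetical. Under these assumptions the predicted effects solve (X'X + lambda*I) u = X'y, where lambda is the ratio of the error variance to the common effect variance.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ridge_blup(x_rows, y, var_error, var_effect):
    """Predicted effects when all predictors share one variance component.

    Assumes centered predictors and response, so no intercept is fitted.
    """
    lam = var_error / var_effect  # shrinkage (ridge) parameter
    p = len(x_rows[0])
    xtx = [[sum(r[i] * r[j] for r in x_rows) + (lam if i == j else 0.0)
            for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(x_rows, y)) for i in range(p)]
    return solve(xtx, xty)


# Hypothetical toy data: two highly correlated, centered predictors.
X = [[-1.0, -1.1], [0.0, 0.1], [1.0, 1.0]]
Y = [-2.0, 0.1, 1.9]

weak = ridge_blup(X, Y, var_error=1e-6, var_effect=1.0)   # almost no shrinkage
strong = ridge_blup(X, Y, var_error=1.0, var_effect=1.0)  # lambda = 1
```

A larger error variance relative to the effect variance gives a larger lambda and pulls the predicted effects closer to zero, which is what stabilizes the fit when predictors are strongly correlated.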
As always, it is difficult to tell beforehand which predictive model best fits your data. You should therefore plan to run your data through several, if not all, of the predictive models to find out which works best. The Cross Validation Model Comparison process is especially useful for this task. See Cross Validation Model Comparison for more details.
The adsl_dii.sas7bdat data set, used in the following example, consists of 906 rows, each corresponding to an individual, and 382 columns of data on these individuals. It was generated from the original nicardipine ADSL data set described in Nicardipine and is included with JMP Clinical. This data set is partially shown below.