The Lipid Data.jmp data table contains blood measurements, physical measurements, and questionnaire data from 95 subjects at a California hospital. You want to create a validation column to use for validation in future analyses.
1. Open the Lipid Data.jmp data table.
2. Select Analyze > Distribution.
3.
Figure 2.15 Distribution of Gender in Lipid Data.jmp
Figure 2.15 shows the distribution of Gender in the data set. Notice that males and females are not equally represented. Because females are scarce in the data, you want to balance the genders across the validation and training sets.
4. Select Analyze > Predictive Modeling > Make Validation Column.
5. Click Stratified Random.
6. Select Gender as the column used for validation holdback.
7. Click OK.
8. Select Analyze > Fit Y by X.
9.
10. Click OK.
Figure 2.16 shows how Gender is distributed across the validation and training sets. Note that about 75% of both females and males are in the training set, and about 25% of both are in the validation set.
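To make the idea of stratified holdback concrete outside of JMP, here is a minimal Python sketch. It is not JMP's implementation. The function name `stratified_validation_column`, the 25% validation rate, and the gender counts (20 females, 75 males) are all assumptions chosen for illustration; the actual counts in Lipid Data.jmp are not given here. The sketch assigns rows to the validation set at random within each gender, then tabulates the result in the same spirit as the Fit Y by X check.

```python
import random
from collections import Counter, defaultdict


def stratified_validation_column(strata, validation_rate=0.25, seed=0):
    """Return one 'Training'/'Validation' label per row.

    Rows are assigned at random *within* each stratum, so every
    stratum contributes roughly the same share to the validation set.
    """
    rng = random.Random(seed)

    # Group row indices by stratum value (for example, by gender).
    rows_by_stratum = defaultdict(list)
    for row, stratum in enumerate(strata):
        rows_by_stratum[stratum].append(row)

    labels = ["Training"] * len(strata)
    for rows in rows_by_stratum.values():
        rng.shuffle(rows)
        n_validation = round(len(rows) * validation_rate)
        for row in rows[:n_validation]:
            labels[row] = "Validation"
    return labels


# Hypothetical Gender column: 95 rows with made-up counts.
gender = ["female"] * 20 + ["male"] * 75
validation = stratified_validation_column(gender)

# Cross-tabulate Gender by Validation to check the balance.
counts = Counter(zip(gender, validation))
for g in ("female", "male"):
    n_train = counts[(g, "Training")]
    n_valid = counts[(g, "Validation")]
    share = n_valid / (n_train + n_valid)
    print(f"{g}: {n_train} training, {n_valid} validation ({share:.0%} held back)")
```

Because the split is drawn separately for each gender, the scarce group cannot end up under-represented in the validation set by chance, which is the risk with a simple random split.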