JMP 13.2 Online Documentation (English)
Launch the Make Validation Column Utility
You can launch the Make Validation Column utility in two ways:
• Select Analyze > Predictive Modeling > Make Validation Column. See Make Validation Column Window.
• Click Validation in a platform launch window. See Click Validation in a Platform Launch.
Make Validation Column Window
In the Make Validation Column window, you specify the proportion or number of rows for each of your holdback sets and then you select a method for constructing the holdback sets.
Figure: Make Validation Column Window
• Next to Training Set, Validation Set, and Test Set, enter values that represent the proportions or numbers of rows that you would like to include in each of these sets. The default values construct a training set that contains about 75% of the rows and a validation set that contains about 25% of the rows.
• Enter a name for your validation column next to New Column Name.
There are five methods available to create the holdback sets.
Formula Random
Partitions the data into sets based on the allocations entered. For example, if the default values are entered, each row has a probability of 0.75 of being included in the training set and a probability of 0.25 of being included in the validation set. The formula is saved to the column. To see it, click the plus icon to the right of the column name in the Columns panel.
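The per-row assignment described for Formula Random can be illustrated outside JMP. The following Python sketch is not the formula that JMP saves to the column; it only shows the idea of drawing one uniform random number per row and comparing it with cumulative allocation thresholds. The function name `formula_random` and its parameters are hypothetical.

```python
import random


def formula_random(n_rows, train=0.75, valid=0.25, test=0.0, rng=None):
    """Assign each row to a set with an independent uniform draw.

    Hypothetical illustration only; this is not JMP's saved formula.
    """
    rng = rng or random.Random()
    total = train + valid + test
    cut_train = train / total            # upper threshold for Training
    cut_valid = (train + valid) / total  # upper threshold for Validation
    labels = []
    for _ in range(n_rows):
        u = rng.random()  # uniform draw in [0, 1)
        if u < cut_train:
            labels.append("Training")
        elif u < cut_valid:
            labels.append("Validation")
        else:
            labels.append("Test")
    return labels
```

Because each row is drawn independently, the realized set sizes are only approximately equal to the requested proportions.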
Fixed Random
Partitions the data into sets based on the allocations entered. For example, if the default values are entered, each row has a probability of 0.75 of being included in the training set and a probability of 0.25 of being included in the validation set. You can specify a random seed that enables you to reproduce the allocations in the future. No formula is saved to the column.
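The role of the random seed in Fixed Random can be sketched in Python. This is an illustration under stated assumptions, not JMP's algorithm: the function name `fixed_random` and its parameters are hypothetical, and Python's random number generator differs from the one JMP uses, so the same seed does not give the same allocation in both tools.

```python
import random


def fixed_random(n_rows, train=0.75, valid=0.25, test=0.0, seed=None):
    """Assign rows to sets at random; a given seed reproduces the result.

    Hypothetical illustration only; this is not JMP's algorithm.
    """
    rng = random.Random(seed)  # private generator, seeded for reproducibility
    return rng.choices(
        ["Training", "Validation", "Test"],
        weights=[train, valid, test],
        k=n_rows,
    )
```

Calling `fixed_random(100, seed=123)` twice returns the same list of assignments, which is the behavior that a saved seed is meant to provide.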
Stratified Random
Partitions the data into balanced sets based on levels of columns that you specify. Use this option when you want a balanced representation of a column’s levels in each of the training, validation, and testing sets.
When you click Stratified Random, a window appears that enables you to select one or more columns by which to stratify the data. When you click OK, the validation column is added to the data table. As in the Fixed Random case, rows are randomly assigned to the holdback sets based on the specified allocations. However, this is done at each level or combination of levels of the stratifying columns.
A column is added to the data table with a Notes property that gives the stratifying variables.
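The effect of stratifying can be sketched in Python. This is a minimal illustration, not JMP's implementation: it shuffles the rows within each level and slices them by the requested proportions, which is one common way to obtain balanced sets, whereas the text above describes random assignment within each level. The function name `stratified_random` and its parameters are hypothetical.

```python
import random
from collections import defaultdict


def stratified_random(strata, train=0.75, valid=0.25, seed=None):
    """Split rows into sets separately within each stratum.

    `strata` holds one key per row (a level, or a tuple of levels when
    stratifying on several columns). Hypothetical illustration only.
    """
    rng = random.Random(seed)
    rows_by_level = defaultdict(list)
    for row, level in enumerate(strata):
        rows_by_level[level].append(row)

    labels = [None] * len(strata)
    for rows in rows_by_level.values():
        rng.shuffle(rows)  # random order within this stratum
        n_train = round(train * len(rows))
        n_valid = round(valid * len(rows))
        for position, row in enumerate(rows):
            if position < n_train:
                labels[row] = "Training"
            elif position < n_train + n_valid:
                labels[row] = "Validation"
            else:
                labels[row] = "Test"  # whatever remains in the stratum
    return labels
```

With eight rows at level A and four rows at level B, the default allocations place six A rows and three B rows in the training set, so both levels are represented in the same 75/25 ratio.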
Grouped Random
Partitions the data into sets in such a way that entire levels of a specified column or combinations of levels of two or more columns are placed in the same holdback set. Use this option when splitting levels across holdback sets is not desirable.
When you click Grouped Random, a window appears that enables you to select one or more columns to be grouping columns. When you click OK, the levels are randomly assigned to holdback sets. When a level contains more rows than the proportion or number of rows that you specified, it stays intact in its assigned holdback set, and fewer rows are allocated to the training set. Because of this, the sizes of the resulting sets can differ slightly from the sizes that you specified.
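The idea of keeping whole levels together can be sketched in Python. This is an illustration under stated assumptions, not JMP's algorithm: it shuffles the levels and fills the sets in order until each target share of rows is reached, which is only one of several ways to assign whole groups. The function name `grouped_random` and its parameters are hypothetical.

```python
import random
from collections import Counter


def grouped_random(groups, train=0.75, valid=0.25, seed=None):
    """Assign every row of a group to the same set.

    `groups` holds one group key per row. Hypothetical illustration only.
    """
    rng = random.Random(seed)
    sizes = Counter(groups)  # number of rows in each group
    levels = list(sizes)
    rng.shuffle(levels)      # random order in which groups are placed

    n_rows = len(groups)
    set_of_level = {}
    rows_placed = 0
    for level in levels:
        share = rows_placed / n_rows  # share of rows placed so far
        if share < train:
            set_of_level[level] = "Training"
        elif share < train + valid:
            set_of_level[level] = "Validation"
        else:
            set_of_level[level] = "Test"
        rows_placed += sizes[level]   # the whole group moves together
    return [set_of_level[g] for g in groups]
```

Because groups are never split, the realized set sizes depend on the group sizes and usually do not match the requested proportions exactly.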
Cutpoint
Partitions the data into sets based on time series cutpoints. Use this option when you want to assign your data to holdback sets based on time periods.
When you click Cutpoint, a window appears that enables you to select one or more columns to define time periods. When you click OK, a JMP Alert appears that shows the assigned cutpoints. A column that reflects this assignment is added to the data table. The training set consists of rows between the first cutpoint and the second cutpoint. The validation set consists of rows between the second and third cutpoints. The test set consists of the remaining rows. These sets are chosen to reflect the proportions or numbers of rows that you specified.
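A time-based split of this kind can be sketched in Python. This is a minimal illustration, not JMP's implementation: it handles a single numeric time column, places the cutpoints at the sorted time values where the requested row counts are reached, and keeps rows with equal time values in the same set. The function name `cutpoint_split` and its parameters are hypothetical.

```python
def cutpoint_split(times, train=0.75, valid=0.25):
    """Assign rows to sets by time order using two cutpoints.

    Returns the per-row labels and the (validation_start, test_start)
    cutpoints. Hypothetical illustration only.
    """
    n_rows = len(times)
    ordered = sorted(times)
    n_train = round(train * n_rows)
    n_valid = round(valid * n_rows)

    # Time values at which the validation and test periods begin.
    valid_start = ordered[n_train] if n_train < n_rows else None
    test_end_index = n_train + n_valid
    test_start = ordered[test_end_index] if test_end_index < n_rows else None

    labels = []
    for t in times:
        if test_start is not None and t >= test_start:
            labels.append("Test")
        elif valid_start is not None and t >= valid_start:
            labels.append("Validation")
        else:
            labels.append("Training")
    return labels, (valid_start, test_start)
```

For example, `cutpoint_split(times, train=0.6, valid=0.2)` assigns the earliest 60% of the rows to training, the next 20% to validation, and the most recent rows to the test set.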
Click Validation in a Platform Launch
Use this method if you are in a platform launch window and need to construct a validation column quickly. Note the following:
• The platform must support a Validation column.
• The Select Columns list must have no columns selected.
Click the Validation button in the platform launch window. A Make Validation Column window appears with default settings of 0.7 for the Training Set, 0.3 for the Validation Set, and 0.0 for the Test Set.
1. Enter your desired proportions or numbers next to Training Set, Validation Set, and Test Set.
2. Type a name for the new column next to New Column Name.
3. Click OK.
The new column appears in the data table with a formula. In the launch window, the new column is assigned to the Validation role.
Note: Launching the Make Validation Column utility through a platform launch window is equivalent to selecting the Formula Random method from Analyze > Predictive Modeling > Make Validation Column. The Fixed Random, Stratified Random, Grouped Random, and Cutpoint methods are not available.
Help created on 9/19/2017