Publication date: 07/08/2024

Node Reports

Each node in the Partition report tree has a report and a red triangle menu with additional options. See Node Options. Terminal nodes also have a Candidates report.

Count

Number of training observations that are characterized by the node.

Mean

(Available only for continuous responses.) The average response for all observations in that node.

Std Dev

(Available only for continuous responses.) The standard deviation of the response for all observations in that node.

G^2

(Available only for categorical responses.) A fit statistic used for categorical responses, in place of the sum of squares that is used for continuous responses. Lower values indicate a better fit. See Statistical Details for the Partition Platform.

Logworth

The Logworth statistic, defined as -log10(p-value). The optimal split is the one that maximizes the logworth. See Statistical Details for the Partition Platform.
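The definition above can be illustrated with a minimal Python sketch. It takes the p-value as given and does not reproduce how the platform computes it. The function name and the candidate p-values are hypothetical and are used only for illustration.

```python
import math

def logworth(p_value: float) -> float:
    """Return -log10(p_value); a smaller p-value gives a larger logworth."""
    if not 0.0 < p_value <= 1.0:
        raise ValueError("p_value must be in (0, 1]")
    return -math.log10(p_value)

# Hypothetical p-values for three candidate columns (not from any real data).
candidates = {"age": 0.04, "dose": 0.0001, "weight": 0.3}

# The candidate with the largest logworth is the one with the smallest p-value.
best = max(candidates, key=lambda name: logworth(candidates[name]))
```

Because the transformation is monotone, ranking candidates by logworth is the same as ranking them by p-value from smallest to largest.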

Difference

(Available only for continuous responses.) The difference in the mean of the response between the two nodes created by the split.

Candidates

For each column, the Candidates report provides details about the optimal split for that column. The optimal split over all terms is marked with an asterisk.

Term

Shows the candidate columns.

Candidate G^2

(Available only for categorical responses.) Likelihood ratio chi-square for the best split. Splitting on the predictor with the largest G^2 maximizes the reduction in the model G^2.
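As a rough illustration, the following Python sketch assumes the common definition of G^2 as twice the negative log-likelihood, with each observation's fitted probability taken to be the proportion of its response level in the node. The platform's exact formula is given in Statistical Details for the Partition Platform; the function names and the data here are hypothetical.

```python
import math
from collections import Counter

def node_g2(labels):
    """G^2 of one node under the assumed definition:
    -2 * sum over observations of ln(proportion of that observation's level)."""
    n = len(labels)
    counts = Counter(labels)
    return -2.0 * sum(c * math.log(c / n) for c in counts.values())

def candidate_g2(parent, left, right):
    """Reduction in G^2 obtained by splitting parent into left and right."""
    return node_g2(parent) - (node_g2(left) + node_g2(right))

# Hypothetical node with eight observations that a split separates perfectly.
parent = ["yes"] * 4 + ["no"] * 4
left = ["yes"] * 4
right = ["no"] * 4

reduction = candidate_g2(parent, left, right)
```

In this example both child nodes are pure, so their G^2 is zero and the reduction equals the parent's entire G^2.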

Candidate SS

(Available only for continuous responses.) Sum of squares for the best split.
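The following Python sketch shows one way to search for the best split of a continuous response on a single numeric column. It assumes that the sum of squares for a split is the parent node's sum of squares minus the combined sum of squares of the two child nodes; see Statistical Details for the Partition Platform for the platform's own definition. The function names and the data are hypothetical.

```python
def sum_of_squares(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def best_split(x, y):
    """Return (cut_point, split_ss) for the cut that maximizes the assumed
    split SS. Rows with x < cut_point go left; the rest go right.
    Returns None if x has fewer than two distinct values."""
    total = sum_of_squares(y)
    best = None
    # Every distinct value except the smallest is a possible cut point.
    for cut in sorted(set(x))[1:]:
        left = [yi for xi, yi in zip(x, y) if xi < cut]
        right = [yi for xi, yi in zip(x, y) if xi >= cut]
        split_ss = total - (sum_of_squares(left) + sum_of_squares(right))
        if best is None or split_ss > best[1]:
            best = (cut, split_ss)
    return best

# Hypothetical predictor and response values.
x = [1, 2, 3, 10, 11, 12]
y = [5.0, 6.0, 5.5, 20.0, 21.0, 19.5]
cut, ss = best_split(x, y)
```

With these values, the response jumps between x = 3 and x = 10, so the search selects a cut point of 10.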

Logworth

The Logworth statistic, defined as -log10(p-value). The optimal split is the one that maximizes the logworth. See Statistical Details for the Partition Platform.

Cut Point

The value of the predictor that determines the split. For a categorical term, the levels in the left-most split are listed.

The optimal split is noted by an asterisk. However, there are cases where the Candidate G^2 or the Candidate SS is higher for one variable, but the logworth is higher for a different variable. In these cases, the symbols > and < are used to point in the best direction for each variable. The asterisk corresponds to the condition where the two criteria agree. See Statistical Details for the Partition Platform.

Node Options

This section describes the options in the red triangle menu for each node.

Split Best

Finds and executes the best split at or below this node.

Split Here

Splits the selected node on the best candidate column.

Split Specific

Lets you specify where a split takes place. This is useful in showing what the criterion is as a function of the cut point, as well as in determining custom cut points. When specifying a splitting column, you can choose the following options for how the split is performed:

Optimal Value

Splits at the optimal value of the selected variable.

Specified Value

Enables you to specify the level where the split takes place.

Output Split Table

Produces a data table showing each possible split and its associated split value.

Prune Below

Eliminates the splits below the selected node.

Prune Worst

Finds and removes the worst split below the selected node.

Select Rows

Selects the data table rows corresponding to this leaf. You can extend the selection by holding down the Shift key and choosing this command from another node.

Show Details

Produces a data table that shows the split criterion for a selected variable. The data table, composed of split intervals and their associated criterion values, has an attached script that produces a graph for the criterion.

Lock

Prevents a node or its subnodes from being chosen for a split. When checked, a lock icon appears in the node title.
