Publication date: 07/08/2024

Per Tree Summaries

In the Bootstrap Forest platform, the Per Tree Summaries report relies on the concepts of in-bag and out-of-bag observations. For an individual tree, the bootstrap sample of observations used in fitting the tree is drawn with replacement. Even if you specify that 100% of the observations are to be sampled, some observations are drawn more than once and others are not drawn at all, so the expected proportion of unused observations is approximately 1/e (about 36.8%). For each individual tree, the unused observations are called the out-of-bag (OOB) observations. The observations used in fitting the tree are called the in-bag (IB) observations. Use the summaries to evaluate the impact of the sampling methodology on the trees. You should expect the summary values to be similar across the individual trees.
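The 1/e figure can be checked with a short calculation. The following Python sketch is an illustration only and is not part of JMP; the function names are hypothetical. It assumes a 100% sampling rate, that is, n draws with replacement from n rows, and compares the exact probability that a row is never drawn, (1 - 1/n)^n, with a single simulated bootstrap sample.

```python
import math
import random


def expected_oob_fraction(n: int) -> float:
    """Probability that a given row is never drawn in n draws
    with replacement from n rows (assumes a 100% sampling rate)."""
    return (1.0 - 1.0 / n) ** n


def simulated_oob_fraction(n: int, seed: int = 0) -> float:
    """Draw one bootstrap sample of size n and return the share of
    rows that were never selected (the out-of-bag rows)."""
    rng = random.Random(seed)
    in_bag = {rng.randrange(n) for _ in range(n)}  # distinct rows drawn
    return 1.0 - len(in_bag) / n


if __name__ == "__main__":
    for n in (10, 100, 10_000):
        print(n, expected_oob_fraction(n), simulated_oob_fraction(n))
    print("1/e =", math.exp(-1))
```

As n grows, both values approach 1/e, which is why roughly a third of the rows are out-of-bag for each tree even at a 100% sampling rate.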

The Per Tree Summaries report shows the following summary statistics for each tree:

Splits

The number of splits in the decision tree.

Rank

The rank of the tree’s OOB Loss/N in ascending order. The tree with the smallest OOB Loss/N has Rank 1.

OOB Loss

A measure of the predictive inaccuracy of the tree when applied to the Out Of Bag rows prior to tree pruning. Trees continue to split until they reach the maximum specified size or until the stopping criterion stops improving. If splitting stops because the stopping criterion fails to improve, the tree is pruned back one level to obtain the final tree. Lower values indicate higher predictive accuracy.

OOB Loss/N

The OOB Loss divided by the number of OOB rows, OOB N.

RSquare

(Available only for continuous responses.) The RSquare value for the tree.

IB SSE

(Available only for continuous responses.) Sum of squared errors for the In Bag rows.

IB SSE/N

(Available only for continuous responses.) Sum of squared errors for the In Bag rows divided by the number of In Bag observations. The number of In Bag observations is equal to the number of observations in the training set multiplied by the bootstrap sampling rate that you specify in the Bootstrap Forest Specification window.

OOB N

(Available only for continuous responses.) The number of Out Of Bag rows.

OOB SSE

(Available only for continuous responses.) Sum of squared errors when the final tree is applied to the Out Of Bag rows.

OOB SSE/N

(Available only for continuous responses.) The OOB SSE divided by the number of OOB rows, OOB N.
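The in-bag and out-of-bag statistics defined above can be illustrated with a small Python sketch for a continuous response. This is not JMP's implementation, and it rests on stated assumptions: the data, predictions, and names are hypothetical; in-bag rows are counted once per draw, so the in-bag count equals the number of draws; and OOB SSE/N stands in for OOB Loss/N when ranking, because the exact loss function is not defined here.

```python
from dataclasses import dataclass


@dataclass
class TreeSummary:
    ib_sse: float
    ib_sse_per_n: float
    oob_n: int
    oob_sse: float
    oob_sse_per_n: float
    rank: int = 0


def sse(y, pred, rows):
    """Sum of squared errors over the given row indices."""
    return sum((y[i] - pred[i]) ** 2 for i in rows)


def summarize(y, tree_preds, in_bag_draws):
    """Build one summary per tree.

    tree_preds[k][i] is tree k's prediction for row i (hypothetical input).
    in_bag_draws[k] lists the rows drawn for tree k, duplicates included.
    """
    summaries = []
    for pred, draws in zip(tree_preds, in_bag_draws):
        drawn = set(draws)
        oob_rows = [i for i in range(len(y)) if i not in drawn]
        ib_sse = sse(y, pred, draws)  # assumption: duplicates counted per draw
        oob_sse = sse(y, pred, oob_rows)
        oob_n = len(oob_rows)
        summaries.append(TreeSummary(
            ib_sse=ib_sse,
            ib_sse_per_n=ib_sse / len(draws),
            oob_n=oob_n,
            oob_sse=oob_sse,
            oob_sse_per_n=oob_sse / oob_n if oob_n else float("nan"),
        ))
    # Rank 1 goes to the tree with the smallest per-row OOB error
    # (assumption: OOB SSE/N used in place of OOB Loss/N).
    order = sorted(range(len(summaries)),
                   key=lambda k: summaries[k].oob_sse_per_n)
    for r, k in enumerate(order, start=1):
        summaries[k].rank = r
    return summaries


# Toy data: four rows, two trees.
y = [1.0, 2.0, 3.0, 4.0]
tree_preds = [[1.0, 2.0, 2.0, 4.0], [1.5, 1.5, 3.5, 3.5]]
in_bag_draws = [[0, 1, 1, 3], [0, 0, 2, 3]]
summaries = summarize(y, tree_preds, in_bag_draws)

if __name__ == "__main__":
    for k, s in enumerate(summaries):
        print(k, s)
```

In this toy example the first tree fits its in-bag rows perfectly but has the larger out-of-bag error, so it receives the lower rank. Comparing in-bag and out-of-bag values across trees in this way is the kind of check the report is meant to support.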
