How to calculate the generalization error of random forests?
Perhaps this is not the proper place to ask this
question, but I am out of options, so I apologize in
advance.
I want to know how the generalization error of a
random forest (or an upper bound on it?) is determined
using the out-of-bag estimate. I read in Breiman's
paper that s and p determine the generalization error:
PE* <= p (1 - s^2) / s^2
Does s stand for the strength of an individual tree or
of the entire ensemble? As I understand it, p stands
for the correlation between the trees.
Say I have built 3 trees in my forest and I know, for
each tree, which instances were left out during
training. How do I calculate s and p so that I can
calculate the error?
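
To make the question concrete, here is a minimal sketch of how s and p might be estimated from out-of-bag votes, based on my reading of the out-of-bag estimates of strength and correlation in Breiman's paper. It assumes that s is the mean out-of-bag margin over the training set (so a property of the ensemble rather than of a single tree), and that p is the variance of that margin divided by the squared mean of the per-tree standard deviations. The function name `oob_strength_correlation` and the array layout (`preds`, `oob`, `y`) are hypothetical choices for illustration, class labels are assumed to be integers 0..n_classes-1, and NumPy is assumed.

```python
import numpy as np

def oob_strength_correlation(preds, oob, y, n_classes):
    """Estimate strength s, correlation p and the bound p(1 - s^2)/s^2.

    preds : (n_trees, n_samples) int array, class predicted by each tree
    oob   : (n_trees, n_samples) bool array, True where the instance was
            left out of that tree's bootstrap sample
    y     : (n_samples,) int array of true classes
    """
    preds = np.asarray(preds)
    oob = np.asarray(oob, dtype=bool)
    y = np.asarray(y)
    n_trees, n = preds.shape

    # Count out-of-bag votes per instance and class.
    votes = np.zeros((n, n_classes))
    for k in range(n_trees):
        idx = np.flatnonzero(oob[k])
        np.add.at(votes, (idx, preds[k, idx]), 1.0)

    counts = votes.sum(axis=1)
    keep = counts > 0                      # out-of-bag for at least one tree
    Q = votes[keep] / counts[keep, None]   # out-of-bag vote proportions
    y_keep = y[keep]
    rows = np.arange(len(y_keep))

    # Margin: vote share of the true class minus the best wrong class.
    q_true = Q[rows, y_keep]
    Q_other = Q.copy()
    Q_other[rows, y_keep] = -1.0           # exclude the true class from the max
    j_hat = Q_other.argmax(axis=1)         # most-voted wrong class
    margin = q_true - Q_other[rows, j_hat]

    s = margin.mean()                      # strength
    var_mr = (margin ** 2).mean() - s ** 2

    # Per-tree standard deviation of the raw margin, on that tree's OOB set.
    j_hat_full = np.full(n, -1)
    j_hat_full[keep] = j_hat
    sds = []
    for k in range(n_trees):
        idx = np.flatnonzero(oob[k])
        if idx.size == 0:
            continue                       # tree has no out-of-bag instances
        p1 = np.mean(preds[k, idx] == y[idx])           # votes for true class
        p2 = np.mean(preds[k, idx] == j_hat_full[idx])  # votes for j_hat
        sds.append(np.sqrt(p1 + p2 - (p1 - p2) ** 2))
    mean_sd = np.mean(sds) if sds else 0.0

    rho = var_mr / mean_sd ** 2 if mean_sd > 0 else float("nan")
    # The bound is only meaningful for positive strength.
    bound = rho * (1 - s ** 2) / s ** 2 if s > 0 else float("inf")
    return s, rho, bound
```

With only 3 trees these estimates would be very noisy. The out-of-bag misclassification rate is the more direct estimate of the generalization error, and the bound computed from s and p is generally loose.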