Question on extracting subsampled features from node in Random forest Package


Samir Rachid Zaim
Hi all,

------------------------------------------------------------------------------------------------------------
*Question:*
*Is there a way to see what variables are subsampled in a node in a tree in
a random forest?*
------------------------------------------------------------------------------------------------------------
I posted this question on Cross Validated *(see here)
<https://stats.stackexchange.com/questions/419872/is-there-a-way-to-see-what-variables-are-subsampled-in-a-node-in-a-tree-in-a-ran>*,
but it was closed as "off-topic", so I'll rewrite it and hopefully get some
feedback there. I figured I'd also try here.

In a random forest, the *mtry* parameter determines how many features are
randomly sampled as candidates at each node of a tree in a *randomForest*
classifier. At the moment, if I search through the randomForest object, I
can get a tree map that shows which feature was chosen and what the
splitting value was for that node.
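To make clear what I mean by "subsampled features", here is a toy sketch in base R. It is only an illustration and not randomForest's internal code: it assumes *mtry* is the number of candidate features drawn per node (defaulting to floor(sqrt(p)) for classification, as the package documentation describes), and `sample_candidates` is a hypothetical helper made up for this example.

```r
# Toy illustration only: this is NOT randomForest's internal code.
set.seed(1)

p    <- 10               # number of features (a 10-column X)
mtry <- floor(sqrt(p))   # documented classification default in randomForest

# Hypothetical helper: draw the candidate feature indices for one node
sample_candidates <- function(p, mtry) sort(sample.int(p, mtry))

# Candidate sets for five hypothetical nodes; each row is one node
candidates <- t(replicate(5, sample_candidates(p, mtry)))
print(candidates)
```

Each row here is the kind of per-node candidate set I would like to recover from a fitted forest.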

My question is: is there a way to also see which features were subsampled
for that node?
If not, is that an option for a future release?

The example below shows what's currently available when you scan through
the forest in the rf.object, but it seems to only include the final
variable and splitting value, rather than the entire set of features at
each node.
------------------------------------------------------------------------------------------------------------
*Example:*

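set.seed(1)  # for reproducibility

X <- matrix(rnorm(1000), ncol = 10)   # 100 observations, 10 features
beta <- rep(1, 10)
z <- X %*% beta                       # linear predictor
y <- factor(rbinom(100, size = 1, prob = 1 / (1 + exp(-z))))

rf.object <- randomForest::randomForest(X, y, keep.forest = TRUE)

str(rf.object$forest)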
------------------------------------------------------------------------------------------------------------
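For reference, a minimal sketch of how the per-node table for a single tree can be pulled out with `getTree()`. It assumes the randomForest package is installed, and the seed and simulated data are arbitrary choices for illustration. As far as this output shows, each row records only the variable that won the split and its split point, not the candidates it was chosen from.

```r
# Assumes the randomForest package is installed; the data are simulated.
set.seed(42)
X <- matrix(rnorm(1000), ncol = 10)
y <- factor(rbinom(100, size = 1, prob = 1 / (1 + exp(-rowSums(X)))))

rf.object <- randomForest::randomForest(X, y, keep.forest = TRUE)

# One row per node of tree k: daughters, chosen split variable,
# split point, status, and prediction
tree1 <- randomForest::getTree(rf.object, k = 1, labelVar = FALSE)
head(tree1)

# Number of candidate features drawn per split in this fit
rf.object$mtry
```

In the "split var" column, the entry is the column index of the chosen feature; terminal nodes show 0 there.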

I appreciate any and all help. Thanks!!

--
Samir Rachid Zaim,
PhD Student in Statistics and Data Science,
Data Science Ambassador, College of Medicine
University of Arizona Bioscience Research Lab
1230 N Cherry Ave,
Tucson, AZ 85719

email: [hidden email]
website: https://samirrachidzaim.github.io/
