Multivariate random forests in R - how to obtain variance explained for multiple responses in randomForestSRC package - or other package
I want to use multivariate random forests (MRF) for multivariate regression.
We are testing whether acoustic indices can predict the structure (relative abundances) of vocalising avian communities in the UK and Ecuador.
There are 26 acoustic indices, 65 UK species and 95 Ecuadorian species.
I want to build one model per ecozone (UK/EC), using the relative abundances of all species as the response matrix and the acoustic indices as predictors.
I’d then like to know:
- total variance explained & error
- variable importance (stable rank at least)
- proximity matrix
The interface of the randomForestSRC package looks promising. You can specify an MRF, and printing the fitted model gives a summary like this:
Sample size: 1984
Number of trees: 1000
Minimum terminal node size: 5
Average no. of terminal nodes: 506.754
No. of variables tried at each split: 9
Total no. of variables: 26
Total no. of responses: 65
User has requested response: UN
Splitting rule: mv.mse
% variance explained: 72.46
Error rate: 0.02
But I can’t see how to request the output (% variance explained and error) for all 65 responses at once, only for one response at a time (here UN).
If anyone has experience with this package, or knows how to achieve this in other packages, I’d love to hear from you.
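For reference, here is a minimal sketch of one route that may work, assuming the helper functions `get.mv.formula()`, `get.mv.error()` and `get.mv.vimp()` shipped with randomForestSRC behave as documented. It uses the built-in `mtcars` data with two responses purely for illustration (not the acoustic data), and it assumes that the standardized error is the OOB MSE divided by the response variance, so that 1 minus that value is the proportion of variance explained.

```r
library(randomForestSRC)

# Illustrative data only: two continuous responses from the built-in mtcars set.
ynames <- c("mpg", "disp")
set.seed(1)

# get.mv.formula() builds a Multivar(...) ~ . formula from response names,
# which avoids typing every response name by hand.
fit <- rfsrc(get.mv.formula(ynames), data = mtcars,
             ntree = 500, importance = TRUE, proximity = TRUE)

# OOB error for every response in one call. With standardize = TRUE the
# regression error is scaled by the variance of each response (assumption
# stated above), so it is comparable across responses.
err_std <- get.mv.error(fit, standardize = TRUE)

# Per-response % variance explained, plus an unweighted mean across responses.
var_expl <- 100 * (1 - err_std)
mean_var_expl <- mean(var_expl)

# Standardized variable importance, one set of values per response.
vimp_std <- get.mv.vimp(fit, standardize = TRUE)

# n x n proximity matrix requested via proximity = TRUE.
prox <- fit$proximity

print(round(var_expl, 2))
```

With a real response matrix, the response names could be passed as something like `get.mv.formula(colnames(species))`, where `species` is a hypothetical data frame of relative abundances. The unweighted mean is only one way to summarise "total" variance explained; weighting by response variance would be another.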
Hopefully I am missing something obvious.
Dr Alice Eldridge
Sussex Humanities Lab,
School of Media, Film and Music
University of Sussex