This isn't related to finance. Part of the reason for separate lists is so that answers may be best found.

> Momop, I think that would undermine the robustness of RF. As I understand it, RF
> averages the predictions of its trees, and each tree's prediction is itself the
> average of a leaf. Dropping variables like you're talking about would risk
> overfitting to your particular dataset rather than the data-generating process.
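
The "average of averages" described in the reply above can be illustrated with a toy sketch. This is plain Python, not the randomForest package, and the leaf contents below are made-up numbers chosen only for illustration: each hypothetical tree predicts the mean of the training targets in the leaf a query point falls into, and the forest predicts the mean of those tree predictions.

```python
from statistics import mean

# Hypothetical training targets found in the leaf that each tree
# routes a single query point to (illustrative values only).
leaf_targets_per_tree = [
    [1.0, 2.0, 3.0],       # leaf reached in tree 1
    [2.0, 4.0],            # leaf reached in tree 2
    [3.0, 3.0, 3.0, 3.0],  # leaf reached in tree 3
]


def tree_prediction(leaf_targets):
    """A regression tree predicts the mean target of the reached leaf."""
    return mean(leaf_targets)


def forest_prediction(leaves):
    """The forest prediction is the mean of the per-tree predictions."""
    return mean(tree_prediction(leaf) for leaf in leaves)


tree_preds = [tree_prediction(leaf) for leaf in leaf_targets_per_tree]
forest_pred = forest_prediction(leaf_targets_per_tree)
print(tree_preds, forest_pred)
```

Because every tree contributes equally, a single noisy tree shifts the forest prediction only slightly, which is the kind of robustness the reply refers to.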

>

> On Sat, Nov 26, 2011 at 6:52 PM, Momop Momop <[hidden email]> wrote:

>

>>

>> Apologies as the mail got sent before completion. Here's the full text

>>

>> I am learning Random Forest and have a basic training question. For my
>> problem, I "derived" various classifiers (var0, var1, ..., var9). They are
>> independent, but the intrinsic values from which they are derived overlap.
>> I get the following data for my RF tree. The question I have is: should I
>> eliminate the classifiers that haven't shown enough importance?
>> (For example, I could scale %IncMSE relatively and maybe just pick the top
>> 3 or 4.)

>>

>> -------------------------------
>>       %IncMSE   IncNodePurity
>> var0  10.84632  7.232559
>> var1  24.53021  7.976509
>> var2  26.5005   4.653162
>> var3  60.18863  21.882258
>> var4  11.97568  7.25413
>> var5  49.63468  16.968472
>> var6  19.55981  10.009517
>> var7  10.36669  13.136694
>> var8  14.16585  7.818673
>> var9  9.75812   7.178831
>> -------------------------------

>>

>> Essentially, what I was attempting to do was to choose the best derived
>> classifiers by eliminating those from the above list that don't show a
>> noticeable relative impact on MSE. Any guidance or pointers are much
>> appreciated. Thanks!
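
The selection idea in the question (scale %IncMSE relative to the largest value, then keep the top 3 or 4) comes down to a few lines of arithmetic. The following is a minimal sketch in plain Python rather than R, using the %IncMSE figures from the table above. The helper names `relative_importance` and `top_k` are hypothetical, and the cut-off `k = 4` is just the poster's example, not a recommendation. Since the ranking is computed from the same data used to fit the forest, any reduced variable set would presumably need checking on held-out data, in line with the overfitting concern raised in the reply.

```python
# %IncMSE values copied from the importance table in the message.
inc_mse = {
    "var0": 10.84632,
    "var1": 24.53021,
    "var2": 26.5005,
    "var3": 60.18863,
    "var4": 11.97568,
    "var5": 49.63468,
    "var6": 19.55981,
    "var7": 10.36669,
    "var8": 14.16585,
    "var9": 9.75812,
}


def relative_importance(scores):
    """Express each score as a percentage of the largest score."""
    top = max(scores.values())
    return {name: 100.0 * value / top for name, value in scores.items()}


def top_k(scores, k):
    """Return the names of the k highest-scoring variables, best first."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


relative = relative_importance(inc_mse)
selected = top_k(inc_mse, 4)  # k = 4 is an assumption taken from the question

for name in selected:
    print(f"{name}: {relative[name]:.1f}% of the top score")
```

On these numbers the scores fall off sharply after var3 and var5, so the choice between keeping two, three, or four variables is a judgment call that the ranking alone does not settle.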

>>

>>

>> ________________________________

>>

>> To: "[hidden email]" <[hidden email]>

>> Sent: Saturday, November 26, 2011 5:45 PM

>> Subject: [R-SIG-Finance] Random Forest Classifiers

>>


>> _______________________________________________
>> [hidden email] mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-sig-finance
>> -- Subscriber-posting only. If you want to post, subscribe first.
>> -- Also note that this is not the r-help list where general R questions
>> should go.
