randomforests - how to classify

pdb
Hi,

I'm experimenting with random forests and want to perform a binary classification task.
I've tried some of the sample code in the help files and it runs, but I get a message to the effect of 'you don't have very many unique values in the target - are you sure you want to do regression?' (sorry, I don't know the exact message, but R is busy right now so I can't check).


In reading the help files I see 2 examples, one for classification and one for regression. To the uninformed - these don't seem much different to each other. How does rf know to do regression or classification?

library(randomForest)

## Classification:
data(iris)
set.seed(71)
iris.rf <- randomForest(Species ~ ., data=iris, importance=TRUE,
                        proximity=TRUE)


## Regression:
data(airquality)
set.seed(131)
ozone.rf <- randomForest(Ozone ~ ., data=airquality, mtry=3,
                         importance=TRUE, na.action=na.omit)


My target variable only has 2 values - why does it want to do regression? I've entered code just like that in the classification example above. Also when it asks me 'are you sure you want to do regression' - how do I say 'NO, do classification please'?



Re: randomforests - how to classify

Changbin Du
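Use randomForest(as.factor(target) ~ ., data = yourdata, ...). randomForest chooses the mode from the type of the response: a factor gives classification, a numeric response gives regression. In the help examples, Species is a factor and Ozone is numeric, which is why the two calls look the same but behave differently.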
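
To make the difference concrete, here is a minimal sketch. It assumes the randomForest package is installed; the data frame `dat` and the column name `target` are made up for illustration (a two-class subset of iris with a 0/1 coded response), so substitute your own data. The `type` component of the fitted object reports which mode was used.

```r
library(randomForest)

# Hypothetical data: keep two iris species and code the response as 0/1.
dat <- subset(iris, Species != "setosa")
dat$target <- as.integer(dat$Species == "virginica")  # numeric 0/1 response
dat$Species <- NULL

set.seed(71)

# Numeric response: a regression forest is fitted. The warning about the
# small number of unique response values is suppressed here.
rf.reg <- suppressWarnings(randomForest(target ~ ., data = dat))

# Factor response: a classification forest is fitted.
dat$target <- as.factor(dat$target)
rf.cls <- randomForest(target ~ ., data = dat)

print(rf.reg$type)  # "regression"
print(rf.cls$type)  # "classification"
```

Converting the column once in the data frame, as above, has the same effect as wrapping it in as.factor() inside the formula, and it keeps the response a factor for later calls such as predict().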





On Tue, May 4, 2010 at 12:07 PM, pdb <[hidden email]> wrote:

> --
> View this message in context:
> http://r.789695.n4.nabble.com/randomforests-how-to-classify-tp2126166p2126166.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



--
Sincerely,
Changbin
--
