|
I'm dealing with classification problems, and I'm trying to specify a
custom scoring metric (recall@p, ROC, etc.) that depends not just on
the class output but also on the probability estimates, so that
caret::train can choose the optimal tuning parameters based on this
metric.

However, when I supply a trainControl summaryFunction, the data given
to it contains only class predictions, so the only metrics possible
are things like accuracy, kappa, etc.

Is there any way to do this that I'm overlooking? If not, could I put
this in as a feature request? Thanks!

--
Yang Zhang
http://yz.mit.edu/
______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code. |
|
Oops, found trainControl's classProbs right after I sent!
On Thu, Feb 9, 2012 at 4:30 PM, Yang Zhang <[hidden email]> wrote:
> However, when I supply a trainControl summaryFunction, the data given
> to it contains only class predictions, so the only metrics possible
> are things like accuracy, kappa, etc.

--
Yang Zhang
http://yz.mit.edu/ |
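For readers following the thread, here is a minimal sketch of what the classProbs route can look like. It assumes the usual summaryFunction signature (data, lev, model) and that, with classProbs = TRUE, data carries one probability column per class level alongside obs and pred. The name aucSummary, the toy data frame, and the choice of a rank-based AUC are illustrative assumptions, not something posted in the thread.

```r
# Hypothetical summary function (the name aucSummary is not part of caret).
# Assumption: with classProbs = TRUE, `data` has columns obs, pred, and one
# probability column named after each class level.
aucSummary <- function(data, lev = NULL, model = NULL) {
  if (is.null(lev)) lev <- levels(data$obs)
  isPos <- data$obs == lev[1]   # treat the first level as the event class
  p     <- data[[lev[1]]]       # predicted probability of that class
  nPos  <- sum(isPos)
  nNeg  <- sum(!isPos)
  # Rank-based (Mann-Whitney) estimate of the area under the ROC curve;
  # tied probabilities receive average ranks.
  auc <- (sum(rank(p)[isPos]) - nPos * (nPos + 1) / 2) / (nPos * nNeg)
  c(AUC = auc)
}

# Toy hold-out data built by hand, so the function can be tried without caret.
toy <- data.frame(
  obs  = factor(c("yes", "yes", "no", "no"), levels = c("yes", "no")),
  pred = factor(c("yes", "no", "yes", "no"), levels = c("yes", "no")),
  yes  = c(0.9, 0.4, 0.6, 0.1)
)
toy$no <- 1 - toy$yes
print(aucSummary(toy, lev = levels(toy$obs)))  # 3 of 4 pairs ordered correctly: 0.75

# If caret is installed, the same function can be handed to trainControl().
if (requireNamespace("caret", quietly = TRUE)) {
  library(caret)
  two  <- droplevels(iris[iris$Species != "setosa", ])
  ctrl <- trainControl(method = "cv", classProbs = TRUE,
                       summaryFunction = aucSummary)
  fit  <- train(two[, 1:4], two$Species, method = "knn",
                metric = "AUC", trControl = ctrl)
  print(fit$results[, c("k", "AUC")])
}
```

The metric argument to train has to match a name in the vector the summary function returns ("AUC" here), since that is how train picks the column to optimize over the tuning grid.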
|
Actually, is there any way to get at additional information beyond the
classProbs? In particular, is there any way to find out the
associated weights, or otherwise the row indices into the original
model matrix corresponding to the tested instances?

On Thu, Feb 9, 2012 at 4:37 PM, Yang Zhang <[hidden email]> wrote:
> Oops, found trainControl's classProbs right after I sent!

--
Yang Zhang
http://yz.mit.edu/ |
|
I think you need to read the man pages and the four vignettes. A lot
of your questions have answers there.

If you don't specify the resampling indices, the ones generated for
you are saved in the train object:

> data(iris)
> TrainData <- iris[,1:4]
> TrainClasses <- iris[,5]
>
> knnFit1 <- train(TrainData, TrainClasses,
+                  method = "knn",
+                  preProcess = c("center", "scale"),
+                  tuneLength = 10,
+                  trControl = trainControl(method = "cv"))
Loading required package: class

Attaching package: ‘class’

The following object(s) are masked from ‘package:reshape’:

    condense

Warning message:
executing %dopar% sequentially: no parallel backend registered
> str(knnFit1$control$index)
List of 10
 $ Fold01: int [1:135] 1 2 3 4 5 6 7 9 10 11 ...
 $ Fold02: int [1:135] 1 2 3 4 5 6 8 9 10 12 ...
 $ Fold03: int [1:135] 1 3 4 5 6 7 8 9 10 11 ...
 $ Fold04: int [1:135] 1 2 3 5 6 7 8 9 10 11 ...
 $ Fold05: int [1:135] 1 2 3 4 6 7 8 9 11 12 ...
 $ Fold06: int [1:135] 1 2 3 4 5 6 7 8 9 10 ...
 $ Fold07: int [1:135] 1 2 3 4 5 7 8 9 10 11 ...
 $ Fold08: int [1:135] 2 3 4 5 6 7 8 9 10 11 ...
 $ Fold09: int [1:135] 1 2 3 4 5 6 7 8 9 10 ...
 $ Fold10: int [1:135] 1 2 4 5 6 7 8 10 11 12 ...

There is also a savePredictions argument that gives you the hold-out results.

I'm not sure which weights you are referring to.

On Fri, Feb 10, 2012 at 4:38 AM, Yang Zhang <[hidden email]> wrote:
> Actually, is there any way to get at additional information beyond the
> classProbs? In particular, is there any way to find out the
> associated weights, or otherwise the row indices into the original
> model matrix corresponding to the tested instances?

--

Max |
|
Sorry for not being more clear - I'm interested in accessing these
indices from within the trainControl summaryFunction, not afterward
(from the train object).

As for the weights, I'm referring to the weights argument passed into
train.

On Fri, Feb 10, 2012 at 5:50 AM, Max Kuhn <[hidden email]> wrote:
> If you don't specify the resampling indices, the ones generated for
> you are saved in the train object:
> [...]
> I'm not sure which weights you are referring to.

--
Yang Zhang
http://yz.mit.edu/ |
|
(I couldn't find answers to this question in the documentation)
On Fri, Feb 10, 2012 at 11:59 AM, Yang Zhang <[hidden email]> wrote:
> Sorry for not being more clear - I'm interested in accessing these
> indices from within the trainControl summaryFunction, not afterward
> (from the train object).
>
> As for the weights, I'm referring to the weights argument passed into
> train.

--
Yang Zhang
http://yz.mit.edu/ |
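The thread ends without an answer to the weights question. As a hedged sketch only: later caret documentation describes a weights column being added to the data passed to summaryFunction when weights are supplied to train, but the thread does not establish whether the version discussed here did so. The function below therefore checks for that column and falls back to equal weights. The name weightedAccuracy and the toy data are hypothetical, and row indices are not covered.

```r
# Hypothetical summary function (not part of caret). It uses a `weights`
# column when one is present in `data` and falls back to equal weights.
# Assumption: the installed caret version adds that column when `weights`
# is passed to train(); this is not confirmed anywhere in the thread.
weightedAccuracy <- function(data, lev = NULL, model = NULL) {
  w <- if ("weights" %in% names(data)) data$weights else rep(1, nrow(data))
  correct <- as.character(data$pred) == as.character(data$obs)
  # Share of the total case weight that was classified correctly.
  c(WeightedAccuracy = sum(w[correct]) / sum(w))
}

# Toy hold-out data built by hand, so the function can be tried without caret.
toy <- data.frame(
  obs     = factor(c("a", "a", "b", "b")),
  pred    = factor(c("a", "b", "b", "b")),
  weights = c(1, 3, 1, 1)
)
print(weightedAccuracy(toy))                      # weighted: 3 / 6
print(weightedAccuracy(toy[, c("obs", "pred")]))  # no weights column: 3 / 4
```

Printing names(data) from inside the summary function during a small train run is a quick way to see what a given caret version actually passes in.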
