Using bartMachine with the caret package

Patrick Connolly-4
In this video, https://www.youtube.com/watch?v=z8PRU46I3NY, Dave Langer
uses the Titanic data as an example of using caret to create xgbTree
models.  The caret train() function has a tuneGrid argument, which
takes a data frame of candidate tuning values set up like so:

tune.grid <- expand.grid(eta = c(0.05, 0.075, 0.1),
                         nrounds = c(50, 75, 100),
                         max_depth = 6:8,
                         min_child_weight = c(2, 2.25, 2.5),
                         colsample_bytree = (3:5)/10,
                         gamma = 0, subsample = 1)
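
As a rough illustration of what such a grid contains (a minimal sketch in
base R, using a cut-down grid with the hypothetical name small.grid):
expand.grid() returns a data frame with one row per combination of the
supplied values, and each row is one candidate parameter setting.

```r
# Cut-down version of the grid above: 3 values of eta x 3 values of
# max_depth x 1 value of gamma gives 9 candidate settings.
small.grid <- expand.grid(eta = c(0.05, 0.075, 0.1),
                          max_depth = 6:8,
                          gamma = 0)

class(small.grid)   # "data.frame"
nrow(small.grid)    # 9
names(small.grid)   # "eta" "max_depth" "gamma"
```

By the same arithmetic, the full grid above has 3^5 = 243 rows, since five
of its parameters take three values each and two are held fixed.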

That approach also worked with my data.  By making the corresponding
adjustments, I was also successful with gbm, bstTree and extraTrees
models, but I can't get it to work with bartMachine models.  I get
dozens of messages like these:

bartMachine initializing with 50 trees...
bartMachine vars checked...
bartMachine java init...
bartMachine factors created...
bartMachine before preprocess...
bartMachine after preprocess... 19 total features...
bartMachine sigsq estimated...
bartMachine initializing with 30 trees...
bartMachine vars checked...
bartMachine java init...
bartMachine factors created...
bartMachine before preprocess...
bartMachine after preprocess... 19 total features...
bartMachine sigsq estimated...

[...]

And eventually, this:

Something is wrong; all the RMSE metric values are missing:
      RMSE        Rsquared        MAE    
 Min.   : NA   Min.   : NA   Min.   : NA  
 1st Qu.: NA   1st Qu.: NA   1st Qu.: NA  
 Median : NA   Median : NA   Median : NA  
 Mean   :NaN   Mean   :NaN   Mean   :NaN  
 3rd Qu.: NA   3rd Qu.: NA   3rd Qu.: NA  
 Max.   : NA   Max.   : NA   Max.   : NA  
 NA's   :2     NA's   :2     NA's   :2    


If I omit the tuneGrid parameter, I get a model and predictions
comparable to those from the other models.  If I could tune the
parameters I would possibly get better predictions.
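
For reference, a minimal sketch of what a tuning grid for bartMachine
might look like.  It assumes that caret's tuning parameters for
method = "bartMachine" are num_trees, k, alpha, beta and nu, which
modelLookup() should confirm; the name bart.grid and the values are
illustrative only, and this is not a diagnosis of the failure above.

```r
# Assumed tuning parameters for method = "bartMachine":
# num_trees, k, alpha, beta, nu.  Values are illustrative only.
bart.grid <- expand.grid(num_trees = c(30, 50),
                         k = 2,
                         alpha = 0.95,
                         beta = 2,
                         nu = 3)

# If caret is installed, compare the grid's column names with the
# parameters caret lists for this method; both differences should be empty.
if (requireNamespace("caret", quietly = TRUE)) {
  expected <- as.character(caret::modelLookup("bartMachine")$parameter)
  print(setdiff(expected, names(bart.grid)))   # parameters missing from the grid
  print(setdiff(names(bart.grid), expected))   # columns caret does not expect
}

# Hypothetical call, with x and y standing for the predictors and response:
# fit <- caret::train(x = x, y = y, method = "bartMachine",
#                     tuneGrid = bart.grid)
```

The column check matters because the names in tuneGrid have to match the
method's tuning parameters, so a misspelt or missing column is one thing
to rule out first.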

Possibly relevant is the fact that the formula method of defining the
model seems not to work.  I had to supply x and y explicitly.  I
couldn't find out how to use the 'recipe' method.  All the specific
links in the help files were dead.

Any suggestions welcome.

--
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.  
   ___    Patrick Connolly  
 {~._.~}                   Great minds discuss ideas    
 _( Y )_           Average minds discuss events
(:_~*~_:)                  Small minds discuss people  
 (_)-(_)                        ..... Eleanor Roosevelt
         
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.
