boot.stepAIC fails with computed formula

classic Classic list List threaded Threaded
10 messages Options
Reply | Threaded
Open this post in threaded view
|

boot.stepAIC fails with computed formula

Stephen O'hagan
I'm trying to use boot.stepAIC for feature selection; I need to be able to specify the name of the dependent variable programmatically, but this appear to fail:

In R-Studio with MS R Open 3.4:

library(bootStepAIC)

#Fake data
n<-200

x1 <- runif(n, -3, 3)
x2 <- runif(n, -3, 3)
x3 <- runif(n, -3, 3)
x4 <- runif(n, -3, 3)
x5 <- runif(n, -3, 3)
x6 <- runif(n, -3, 3)
x7 <- runif(n, -3, 3)
x8 <- runif(n, -3, 3)
y1 <- 42+x3 + 2*x6 + 3*x8 + runif(n, -0.5, 0.5)

dat <- data.frame(x1,x2,x3,x4,x5,x6,x7,x8,y1)
#the real data won't have these names...

cn <- names(dat)
trg <- "y1"
xvars <- cn[cn!=trg]

frm1<-as.formula(paste(trg,"~1"))
frm2<-as.formula(paste(trg,"~ 1 + ",paste(xvars,collapse = "+")))

strt=lm(y1~1,dat) # boot.stepAIC Works fine

#strt=do.call("lm",list(frm1,data=dat)) ## boot.stepAIC FAILS ##

#strt=lm(frm1,dat) ## boot.stepAIC FAILS ##

limit<-5


stp=stepAIC(strt,direction='forward',steps=limit,
            scope=list(lower=frm1,upper=frm2))

bst <- boot.stepAIC(strt,dat,B=50,alpha=0.05,direction='forward',steps=limit,
                    scope=list(lower=frm1,upper=frm2))

b1 <- bst$Covariates
ball <- data.frame(b1)
names(ball)=unlist(trg)

Any ideas?

Cheers,
SOH


        [[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: boot.stepAIC fails with computed formula

Bert Gunter-2
Failed?  What was the error message?

Cheers,

Bert


Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Tue, Aug 22, 2017 at 8:17 AM, Stephen O'hagan
<[hidden email]> wrote:

> I'm trying to use boot.stepAIC for feature selection; I need to be able to specify the name of the dependent variable programmatically, but this appear to fail:
>
> In R-Studio with MS R Open 3.4:
>
> library(bootStepAIC)
>
> #Fake data
> n<-200
>
> x1 <- runif(n, -3, 3)
> x2 <- runif(n, -3, 3)
> x3 <- runif(n, -3, 3)
> x4 <- runif(n, -3, 3)
> x5 <- runif(n, -3, 3)
> x6 <- runif(n, -3, 3)
> x7 <- runif(n, -3, 3)
> x8 <- runif(n, -3, 3)
> y1 <- 42+x3 + 2*x6 + 3*x8 + runif(n, -0.5, 0.5)
>
> dat <- data.frame(x1,x2,x3,x4,x5,x6,x7,x8,y1)
> #the real data won't have these names...
>
> cn <- names(dat)
> trg <- "y1"
> xvars <- cn[cn!=trg]
>
> frm1<-as.formula(paste(trg,"~1"))
> frm2<-as.formula(paste(trg,"~ 1 + ",paste(xvars,collapse = "+")))
>
> strt=lm(y1~1,dat) # boot.stepAIC Works fine
>
> #strt=do.call("lm",list(frm1,data=dat)) ## boot.stepAIC FAILS ##
>
> #strt=lm(frm1,dat) ## boot.stepAIC FAILS ##
>
> limit<-5
>
>
> stp=stepAIC(strt,direction='forward',steps=limit,
>             scope=list(lower=frm1,upper=frm2))
>
> bst <- boot.stepAIC(strt,dat,B=50,alpha=0.05,direction='forward',steps=limit,
>                     scope=list(lower=frm1,upper=frm2))
>
> b1 <- bst$Covariates
> ball <- data.frame(b1)
> names(ball)=unlist(trg)
>
> Any ideas?
>
> Cheers,
> SOH
>
>
>         [[alternative HTML version deleted]]
>
> ______________________________________________
> [hidden email] mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: boot.stepAIC fails with computed formula

Stephen O'hagan
The error is "the model fit failed in 50 bootstrap samples
Error: non-character argument"

Cheers,
SOH.

On 22/08/2017 17:52, Bert Gunter wrote:

> Failed?  What was the error message?
>
> Cheers,
>
> Bert
>
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along
> and sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
>
> On Tue, Aug 22, 2017 at 8:17 AM, Stephen O'hagan
> <[hidden email]> wrote:
>> I'm trying to use boot.stepAIC for feature selection; I need to be able to specify the name of the dependent variable programmatically, but this appear to fail:
>>
>> In R-Studio with MS R Open 3.4:
>>
>> library(bootStepAIC)
>>
>> #Fake data
>> n<-200
>>
>> x1 <- runif(n, -3, 3)
>> x2 <- runif(n, -3, 3)
>> x3 <- runif(n, -3, 3)
>> x4 <- runif(n, -3, 3)
>> x5 <- runif(n, -3, 3)
>> x6 <- runif(n, -3, 3)
>> x7 <- runif(n, -3, 3)
>> x8 <- runif(n, -3, 3)
>> y1 <- 42+x3 + 2*x6 + 3*x8 + runif(n, -0.5, 0.5)
>>
>> dat <- data.frame(x1,x2,x3,x4,x5,x6,x7,x8,y1)
>> #the real data won't have these names...
>>
>> cn <- names(dat)
>> trg <- "y1"
>> xvars <- cn[cn!=trg]
>>
>> frm1<-as.formula(paste(trg,"~1"))
>> frm2<-as.formula(paste(trg,"~ 1 + ",paste(xvars,collapse = "+")))
>>
>> strt=lm(y1~1,dat) # boot.stepAIC Works fine
>>
>> #strt=do.call("lm",list(frm1,data=dat)) ## boot.stepAIC FAILS ##
>>
>> #strt=lm(frm1,dat) ## boot.stepAIC FAILS ##
>>
>> limit<-5
>>
>>
>> stp=stepAIC(strt,direction='forward',steps=limit,
>>              scope=list(lower=frm1,upper=frm2))
>>
>> bst <- boot.stepAIC(strt,dat,B=50,alpha=0.05,direction='forward',steps=limit,
>>                      scope=list(lower=frm1,upper=frm2))
>>
>> b1 <- bst$Covariates
>> ball <- data.frame(b1)
>> names(ball)=unlist(trg)
>>
>> Any ideas?
>>
>> Cheers,
>> SOH
>>
>>
>>          [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> [hidden email] mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: boot.stepAIC fails with computed formula

Bert Gunter-2
SImplify your call to lm using the "." argument instead of
manipulating formulas.
> strt <- lm(y1 ~ ., data = dat)

and you do not need to explicitly specify the "1+" on the rhs for lm, so

> frm2<-as.formula(paste(trg," ~  ", paste(xvars,collapse = "+")))
works fine, too.

Anyway, doing this gives (but see end of output)"


bst <- boot.stepAIC(strt,data =
dat,B=50,alpha=0.05,direction='forward',steps=limit,
                  scope=list(lower=frm1,upper=frm2))

> bst

Summary of Bootstrapping the 'stepAIC()' procedure for

Call:
lm(formula = y1 ~ ., data = dat)

Bootstrap samples: 50
Direction: forward
Penalty: 2 * df

Covariates selected
   (%)
x1 100
x2 100
x3 100
x4 100
x5 100
x6 100
x7 100
x8 100

Coefficients Sign
   + (%) - (%)
x3   100     0
x6   100     0
x8   100     0
x1    90    10
x2    86    14
x7    62    38
x4    44    56
x5     0   100

Stat Significance
   (%)
x3 100
x6 100
x8 100
x5  34
x2  20
x1  16
x4  10
x7   4


The stepAIC() for the original data-set gave

Call:
lm(formula = y1 ~ x1 + x2 + x3 + x4 + x5 + x6 + x7 + x8, data = dat)

Coefficients:
(Intercept)           x1           x2           x3           x4           x5
  42.008824     0.012304     0.010925     0.976469    -0.005183    -0.021041
         x6           x7           x8
   2.000649     0.004461     3.007071


Stepwise Model Path
Analysis of Deviance Table

Initial Model:
y1 ~ x1 + x2 + x3 + x4 + x5 + x6 + x7 + x8

Final Model:
y1 ~ x1 + x2 + x3 + x4 + x5 + x6 + x7 + x8


  Step Df Deviance Resid. Df Resid. Dev     AIC
1                        191   16.14592 -485.33


HOWEVER, I do not know why your failed calls failed. In view of the
above, it appears to be a bug in the formula interface, so if you do
not get a satisfactory answer here (i.e. I am wrong about this), you
should contact the maintainer.

Cheers,
Bert


Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Tue, Aug 22, 2017 at 10:08 AM, Steve O'Hagan
<[hidden email]> wrote:

> The error is "the model fit failed in 50 bootstrap samples
> Error: non-character argument"
>
> Cheers,
> SOH.
>
>
> On 22/08/2017 17:52, Bert Gunter wrote:
>>
>> Failed?  What was the error message?
>>
>> Cheers,
>>
>> Bert
>>
>>
>> Bert Gunter
>>
>> "The trouble with having an open mind is that people keep coming along
>> and sticking things into it."
>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>>
>>
>> On Tue, Aug 22, 2017 at 8:17 AM, Stephen O'hagan
>> <[hidden email]> wrote:
>>>
>>> I'm trying to use boot.stepAIC for feature selection; I need to be able
>>> to specify the name of the dependent variable programmatically, but this
>>> appear to fail:
>>>
>>> In R-Studio with MS R Open 3.4:
>>>
>>> library(bootStepAIC)
>>>
>>> #Fake data
>>> n<-200
>>>
>>> x1 <- runif(n, -3, 3)
>>> x2 <- runif(n, -3, 3)
>>> x3 <- runif(n, -3, 3)
>>> x4 <- runif(n, -3, 3)
>>> x5 <- runif(n, -3, 3)
>>> x6 <- runif(n, -3, 3)
>>> x7 <- runif(n, -3, 3)
>>> x8 <- runif(n, -3, 3)
>>> y1 <- 42+x3 + 2*x6 + 3*x8 + runif(n, -0.5, 0.5)
>>>
>>> dat <- data.frame(x1,x2,x3,x4,x5,x6,x7,x8,y1)
>>> #the real data won't have these names...
>>>
>>> cn <- names(dat)
>>> trg <- "y1"
>>> xvars <- cn[cn!=trg]
>>>
>>> frm1<-as.formula(paste(trg,"~1"))
>>> frm2<-as.formula(paste(trg,"~ 1 + ",paste(xvars,collapse = "+")))
>>>
>>> strt=lm(y1~1,dat) # boot.stepAIC Works fine
>>>
>>> #strt=do.call("lm",list(frm1,data=dat)) ## boot.stepAIC FAILS ##
>>>
>>> #strt=lm(frm1,dat) ## boot.stepAIC FAILS ##
>>>
>>> limit<-5
>>>
>>>
>>> stp=stepAIC(strt,direction='forward',steps=limit,
>>>              scope=list(lower=frm1,upper=frm2))
>>>
>>> bst <-
>>> boot.stepAIC(strt,dat,B=50,alpha=0.05,direction='forward',steps=limit,
>>>                      scope=list(lower=frm1,upper=frm2))
>>>
>>> b1 <- bst$Covariates
>>> ball <- data.frame(b1)
>>> names(ball)=unlist(trg)
>>>
>>> Any ideas?
>>>
>>> Cheers,
>>> SOH
>>>
>>>
>>>          [[alternative HTML version deleted]]
>>>
>>> ______________________________________________
>>> [hidden email] mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>
>
>

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: boot.stepAIC fails with computed formula

Bert Gunter-2
In reply to this post by Stephen O'hagan
OK, here's the problem. Continuing with your example:

strt1 <- lm(y1 ~1, dat)
strt2 <- lm(frm1,dat)


> strt1

Call:
lm(formula = y1 ~ 1, data = dat)

Coefficients:
(Intercept)
      41.73

> strt2

Call:
lm(formula = frm1, data = dat)

Coefficients:
(Intercept)
      41.73


Note that the formula objects of the lm object are different: strt2
does not evaluate the formula. So presumably boot.step.AIC does no
evaluation and therefore gets confused with the errors you saw. So you
need to get the evaluated formula into the lm object. This can be
done, e.g. via:

> strt2 <- eval(substitute(lm(form,data = dat), list(form = frm1)))

## yielding

> strt2

Call:
lm(formula = y1 ~ 1, data = dat)

Coefficients:
(Intercept)
      41.73

So this looks like it should fix the problem, but alas no, the
boot.stepAIC call still fails with the same error message. Here's why:

> identical(strt$call, strt2$call)
[1] FALSE

So one might rightfully ask, what the heck is going on here?! Further digging:

> str(strt$call)
 language lm(formula = y1 ~ 1, data = dat)

> str(strt2$call)
 language lm(formula = y1 ~ 1, data = dat)

These certainly look identical! -- but of course they're not:

> names(strt$call)
[1] ""        "formula" "data"
> names(strt2$call)
[1] ""        "formula" "data"

So the difference must lie in the formula component, right? ...

> strt$call$formula
y1 ~ 1
> strt2$call$formula
y1 ~ 1

So, thus far, huhh? But..

> class(strt2$call$formula)
[1] "formula"

> class(strt$call$formula)
[1] "call"

So I think therein lies the critical difference that is screwing
things up. NOTE: If I am wrong about this someone **PLEASE** correct
me.

I see no clear workaround for this other than to explicitly avoid
passing a formula in the lm() call with y~1 or y ~ .   I think the
real fix is to make the  boot.stepAIC function smarter in how it
handles its formula argument, and that is above my paygrade (and
degree of interest) . You should probably email the maintainer, who
may not monitor this list. But give it a day or so to give someone
else a chance to correct me if I'm wrong.


HTH.

Cheers,

Bert
Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Tue, Aug 22, 2017 at 8:17 AM, Stephen O'hagan
<[hidden email]> wrote:

> I'm trying to use boot.stepAIC for feature selection; I need to be able to specify the name of the dependent variable programmatically, but this appear to fail:
>
> In R-Studio with MS R Open 3.4:
>
> library(bootStepAIC)
>
> #Fake data
> n<-200
>
> x1 <- runif(n, -3, 3)
> x2 <- runif(n, -3, 3)
> x3 <- runif(n, -3, 3)
> x4 <- runif(n, -3, 3)
> x5 <- runif(n, -3, 3)
> x6 <- runif(n, -3, 3)
> x7 <- runif(n, -3, 3)
> x8 <- runif(n, -3, 3)
> y1 <- 42+x3 + 2*x6 + 3*x8 + runif(n, -0.5, 0.5)
>
> dat <- data.frame(x1,x2,x3,x4,x5,x6,x7,x8,y1)
> #the real data won't have these names...
>
> cn <- names(dat)
> trg <- "y1"
> xvars <- cn[cn!=trg]
>
> frm1<-as.formula(paste(trg,"~1"))
> frm2<-as.formula(paste(trg,"~ 1 + ",paste(xvars,collapse = "+")))
>
> strt=lm(y1~1,dat) # boot.stepAIC Works fine
>
> #strt=do.call("lm",list(frm1,data=dat)) ## boot.stepAIC FAILS ##
>
> #strt=lm(frm1,dat) ## boot.stepAIC FAILS ##
>
> limit<-5
>
>
> stp=stepAIC(strt,direction='forward',steps=limit,
>             scope=list(lower=frm1,upper=frm2))
>
> bst <- boot.stepAIC(strt,dat,B=50,alpha=0.05,direction='forward',steps=limit,
>                     scope=list(lower=frm1,upper=frm2))
>
> b1 <- bst$Covariates
> ball <- data.frame(b1)
> names(ball)=unlist(trg)
>
> Any ideas?
>
> Cheers,
> SOH
>
>
>         [[alternative HTML version deleted]]
>
> ______________________________________________
> [hidden email] mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: boot.stepAIC fails with computed formula

Stephen O'hagan
Until I get a fix that works, a work-around would be to rename the 'y1' column, used a fixed formula, and rename it back afterwards.

Thanks for your help.
SGO.

-----Original Message-----
From: Bert Gunter [mailto:[hidden email]]
Sent: 22 August 2017 20:38
To: Stephen O'hagan <[hidden email]>
Cc: [hidden email]
Subject: Re: [R] boot.stepAIC fails with computed formula

OK, here's the problem. Continuing with your example:

strt1 <- lm(y1 ~1, dat)
strt2 <- lm(frm1,dat)


> strt1

Call:
lm(formula = y1 ~ 1, data = dat)

Coefficients:
(Intercept)
      41.73

> strt2

Call:
lm(formula = frm1, data = dat)

Coefficients:
(Intercept)
      41.73


Note that the formula objects of the lm object are different: strt2 does not evaluate the formula. So presumably boot.step.AIC does no evaluation and therefore gets confused with the errors you saw. So you need to get the evaluated formula into the lm object. This can be done, e.g. via:

> strt2 <- eval(substitute(lm(form,data = dat), list(form = frm1)))

## yielding

> strt2

Call:
lm(formula = y1 ~ 1, data = dat)

Coefficients:
(Intercept)
      41.73

So this looks like it should fix the problem, but alas no, the boot.stepAIC call still fails with the same error message. Here's why:

> identical(strt$call, strt2$call)
[1] FALSE

So one might rightfully ask, what the heck is going on here?! Further digging:

> str(strt$call)
 language lm(formula = y1 ~ 1, data = dat)

> str(strt2$call)
 language lm(formula = y1 ~ 1, data = dat)

These certainly look identical! -- but of course they're not:

> names(strt$call)
[1] ""        "formula" "data"
> names(strt2$call)
[1] ""        "formula" "data"

So the difference must lie in the formula component, right? ...

> strt$call$formula
y1 ~ 1
> strt2$call$formula
y1 ~ 1

So, thus far, huhh? But..

> class(strt2$call$formula)
[1] "formula"

> class(strt$call$formula)
[1] "call"

So I think therein lies the critical difference that is screwing things up. NOTE: If I am wrong about this someone **PLEASE** correct me.

I see no clear workaround for this other than to explicitly avoid
passing a formula in the lm() call with y~1 or y ~ .   I think the
real fix is to make the  boot.stepAIC function smarter in how it handles its formula argument, and that is above my paygrade (and degree of interest) . You should probably email the maintainer, who may not monitor this list. But give it a day or so to give someone else a chance to correct me if I'm wrong.


HTH.

Cheers,

Bert
Bert Gunter

"The trouble with having an open mind is that people keep coming along and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Tue, Aug 22, 2017 at 8:17 AM, Stephen O'hagan <[hidden email]> wrote:

> I'm trying to use boot.stepAIC for feature selection; I need to be able to specify the name of the dependent variable programmatically, but this appear to fail:
>
> In R-Studio with MS R Open 3.4:
>
> library(bootStepAIC)
>
> #Fake data
> n<-200
>
> x1 <- runif(n, -3, 3)
> x2 <- runif(n, -3, 3)
> x3 <- runif(n, -3, 3)
> x4 <- runif(n, -3, 3)
> x5 <- runif(n, -3, 3)
> x6 <- runif(n, -3, 3)
> x7 <- runif(n, -3, 3)
> x8 <- runif(n, -3, 3)
> y1 <- 42+x3 + 2*x6 + 3*x8 + runif(n, -0.5, 0.5)
>
> dat <- data.frame(x1,x2,x3,x4,x5,x6,x7,x8,y1)
> #the real data won't have these names...
>
> cn <- names(dat)
> trg <- "y1"
> xvars <- cn[cn!=trg]
>
> frm1<-as.formula(paste(trg,"~1"))
> frm2<-as.formula(paste(trg,"~ 1 + ",paste(xvars,collapse = "+")))
>
> strt=lm(y1~1,dat) # boot.stepAIC Works fine
>
> #strt=do.call("lm",list(frm1,data=dat)) ## boot.stepAIC FAILS ##
>
> #strt=lm(frm1,dat) ## boot.stepAIC FAILS ##
>
> limit<-5
>
>
> stp=stepAIC(strt,direction='forward',steps=limit,
>             scope=list(lower=frm1,upper=frm2))
>
> bst <- boot.stepAIC(strt,dat,B=50,alpha=0.05,direction='forward',steps=limit,
>                     scope=list(lower=frm1,upper=frm2))
>
> b1 <- bst$Covariates
> ball <- data.frame(b1)
> names(ball)=unlist(trg)
>
> Any ideas?
>
> Cheers,
> SOH
>
>
>         [[alternative HTML version deleted]]
>
> ______________________________________________
> [hidden email] mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: boot.stepAIC fails with computed formula

Stephen O'hagan
In reply to this post by Bert Gunter-2
As far as I can tell from the docs, strt & frm1 should be the simplest model, 'y1~1', since we’re doing stepAIC in the forward direction.

I found that using 'y1~.' for frm2 doesn't always work, hence the explicit enumeration of the xvars into a full formula. I had forgotten that you don't need the +1 term.

With real data, if you start from the most complex model and go backwards, stepAIC just over-fits and gives up  [for same reason I set the steps parameter].

Cheers,
SGO

-----Original Message-----
From: Bert Gunter [mailto:[hidden email]]
Sent: 22 August 2017 18:51
To: Stephen O'hagan <[hidden email]>
Cc: [hidden email]
Subject: Re: [R] boot.stepAIC fails with computed formula

SImplify your call to lm using the "." argument instead of manipulating formulas.
> strt <- lm(y1 ~ ., data = dat)

and you do not need to explicitly specify the "1+" on the rhs for lm, so

> frm2<-as.formula(paste(trg," ~  ", paste(xvars,collapse = "+")))
works fine, too.

Anyway, doing this gives (but see end of output)"


bst <- boot.stepAIC(strt,data =
dat,B=50,alpha=0.05,direction='forward',steps=limit,
                  scope=list(lower=frm1,upper=frm2))

> bst

Summary of Bootstrapping the 'stepAIC()' procedure for

Call:
lm(formula = y1 ~ ., data = dat)

Bootstrap samples: 50
Direction: forward
Penalty: 2 * df

Covariates selected
   (%)
x1 100
x2 100
x3 100
x4 100
x5 100
x6 100
x7 100
x8 100

Coefficients Sign
   + (%) - (%)
x3   100     0
x6   100     0
x8   100     0
x1    90    10
x2    86    14
x7    62    38
x4    44    56
x5     0   100

Stat Significance
   (%)
x3 100
x6 100
x8 100
x5  34
x2  20
x1  16
x4  10
x7   4


The stepAIC() for the original data-set gave

Call:
lm(formula = y1 ~ x1 + x2 + x3 + x4 + x5 + x6 + x7 + x8, data = dat)

Coefficients:
(Intercept)           x1           x2           x3           x4           x5
  42.008824     0.012304     0.010925     0.976469    -0.005183    -0.021041
         x6           x7           x8
   2.000649     0.004461     3.007071


Stepwise Model Path
Analysis of Deviance Table

Initial Model:
y1 ~ x1 + x2 + x3 + x4 + x5 + x6 + x7 + x8

Final Model:
y1 ~ x1 + x2 + x3 + x4 + x5 + x6 + x7 + x8


  Step Df Deviance Resid. Df Resid. Dev     AIC
1                        191   16.14592 -485.33


HOWEVER, I do not know why your failed calls failed. In view of the above, it appears to be a bug in the formula interface, so if you do not get a satisfactory answer here (i.e. I am wrong about this), you should contact the maintainer.

Cheers,
Bert


Bert Gunter

"The trouble with having an open mind is that people keep coming along and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Tue, Aug 22, 2017 at 10:08 AM, Steve O'Hagan <[hidden email]> wrote:

> The error is "the model fit failed in 50 bootstrap samples
> Error: non-character argument"
>
> Cheers,
> SOH.
>
>
> On 22/08/2017 17:52, Bert Gunter wrote:
>>
>> Failed?  What was the error message?
>>
>> Cheers,
>>
>> Bert
>>
>>
>> Bert Gunter
>>
>> "The trouble with having an open mind is that people keep coming
>> along and sticking things into it."
>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>>
>>
>> On Tue, Aug 22, 2017 at 8:17 AM, Stephen O'hagan
>> <[hidden email]> wrote:
>>>
>>> I'm trying to use boot.stepAIC for feature selection; I need to be
>>> able to specify the name of the dependent variable programmatically,
>>> but this appear to fail:
>>>
>>> In R-Studio with MS R Open 3.4:
>>>
>>> library(bootStepAIC)
>>>
>>> #Fake data
>>> n<-200
>>>
>>> x1 <- runif(n, -3, 3)
>>> x2 <- runif(n, -3, 3)
>>> x3 <- runif(n, -3, 3)
>>> x4 <- runif(n, -3, 3)
>>> x5 <- runif(n, -3, 3)
>>> x6 <- runif(n, -3, 3)
>>> x7 <- runif(n, -3, 3)
>>> x8 <- runif(n, -3, 3)
>>> y1 <- 42+x3 + 2*x6 + 3*x8 + runif(n, -0.5, 0.5)
>>>
>>> dat <- data.frame(x1,x2,x3,x4,x5,x6,x7,x8,y1)
>>> #the real data won't have these names...
>>>
>>> cn <- names(dat)
>>> trg <- "y1"
>>> xvars <- cn[cn!=trg]
>>>
>>> frm1<-as.formula(paste(trg,"~1"))
>>> frm2<-as.formula(paste(trg,"~ 1 + ",paste(xvars,collapse = "+")))
>>>
>>> strt=lm(y1~1,dat) # boot.stepAIC Works fine
>>>
>>> #strt=do.call("lm",list(frm1,data=dat)) ## boot.stepAIC FAILS ##
>>>
>>> #strt=lm(frm1,dat) ## boot.stepAIC FAILS ##
>>>
>>> limit<-5
>>>
>>>
>>> stp=stepAIC(strt,direction='forward',steps=limit,
>>>              scope=list(lower=frm1,upper=frm2))
>>>
>>> bst <-
>>> boot.stepAIC(strt,dat,B=50,alpha=0.05,direction='forward',steps=limit,
>>>                      scope=list(lower=frm1,upper=frm2))
>>>
>>> b1 <- bst$Covariates
>>> ball <- data.frame(b1)
>>> names(ball)=unlist(trg)
>>>
>>> Any ideas?
>>>
>>> Cheers,
>>> SOH
>>>
>>>
>>>          [[alternative HTML version deleted]]
>>>
>>> ______________________________________________
>>> [hidden email] mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>
>
>
______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: boot.stepAIC fails with computed formula

Heinz Tuechler
In reply to this post by Stephen O'hagan
It seems that if you build the formula as a character string, and
postpone the "as.formula" into the lm call, it works.

instead of
frm1 <- as.formula(paste(trg,"~1"))
use
frm1a <- paste(trg,"~1")
and then
strt <- lm(as.formula(frm1a),dat)

regards,

Heinz

Stephen O'hagan wrote/hat geschrieben on/am 23.08.2017 12:07:

> Until I get a fix that works, a work-around would be to rename the 'y1' column, used a fixed formula, and rename it back afterwards.
>
> Thanks for your help.
> SGO.
>
> -----Original Message-----
> From: Bert Gunter [mailto:[hidden email]]
> Sent: 22 August 2017 20:38
> To: Stephen O'hagan <[hidden email]>
> Cc: [hidden email]
> Subject: Re: [R] boot.stepAIC fails with computed formula
>
> OK, here's the problem. Continuing with your example:
>
> strt1 <- lm(y1 ~1, dat)
> strt2 <- lm(frm1,dat)
>
>
>> strt1
>
> Call:
> lm(formula = y1 ~ 1, data = dat)
>
> Coefficients:
> (Intercept)
>       41.73
>
>> strt2
>
> Call:
> lm(formula = frm1, data = dat)
>
> Coefficients:
> (Intercept)
>       41.73
>
>
> Note that the formula objects of the lm object are different: strt2 does not evaluate the formula. So presumably boot.step.AIC does no evaluation and therefore gets confused with the errors you saw. So you need to get the evaluated formula into the lm object. This can be done, e.g. via:
>
>> strt2 <- eval(substitute(lm(form,data = dat), list(form = frm1)))
>
> ## yielding
>
>> strt2
>
> Call:
> lm(formula = y1 ~ 1, data = dat)
>
> Coefficients:
> (Intercept)
>       41.73
>
> So this looks like it should fix the problem, but alas no, the boot.stepAIC call still fails with the same error message. Here's why:
>
>> identical(strt$call, strt2$call)
> [1] FALSE
>
> So one might rightfully ask, what the heck is going on here?! Further digging:
>
>> str(strt$call)
>  language lm(formula = y1 ~ 1, data = dat)
>
>> str(strt2$call)
>  language lm(formula = y1 ~ 1, data = dat)
>
> These certainly look identical! -- but of course they're not:
>
>> names(strt$call)
> [1] ""        "formula" "data"
>> names(strt2$call)
> [1] ""        "formula" "data"
>
> So the difference must lie in the formula component, right? ...
>
>> strt$call$formula
> y1 ~ 1
>> strt2$call$formula
> y1 ~ 1
>
> So, thus far, huhh? But..
>
>> class(strt2$call$formula)
> [1] "formula"
>
>> class(strt$call$formula)
> [1] "call"
>
> So I think therein lies the critical difference that is screwing things up. NOTE: If I am wrong about this someone **PLEASE** correct me.
>
> I see no clear workaround for this other than to explicitly avoid
> passing a formula in the lm() call with y~1 or y ~ .   I think the
> real fix is to make the  boot.stepAIC function smarter in how it handles its formula argument, and that is above my paygrade (and degree of interest) . You should probably email the maintainer, who may not monitor this list. But give it a day or so to give someone else a chance to correct me if I'm wrong.
>
>
> HTH.
>
> Cheers,
>
> Bert
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along and sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
>
> On Tue, Aug 22, 2017 at 8:17 AM, Stephen O'hagan <[hidden email]> wrote:
>> I'm trying to use boot.stepAIC for feature selection; I need to be able to specify the name of the dependent variable programmatically, but this appear to fail:
>>
>> In R-Studio with MS R Open 3.4:
>>
>> library(bootStepAIC)
>>
>> #Fake data
>> n<-200
>>
>> x1 <- runif(n, -3, 3)
>> x2 <- runif(n, -3, 3)
>> x3 <- runif(n, -3, 3)
>> x4 <- runif(n, -3, 3)
>> x5 <- runif(n, -3, 3)
>> x6 <- runif(n, -3, 3)
>> x7 <- runif(n, -3, 3)
>> x8 <- runif(n, -3, 3)
>> y1 <- 42+x3 + 2*x6 + 3*x8 + runif(n, -0.5, 0.5)
>>
>> dat <- data.frame(x1,x2,x3,x4,x5,x6,x7,x8,y1)
>> #the real data won't have these names...
>>
>> cn <- names(dat)
>> trg <- "y1"
>> xvars <- cn[cn!=trg]
>>
>> frm1<-as.formula(paste(trg,"~1"))
>> frm2<-as.formula(paste(trg,"~ 1 + ",paste(xvars,collapse = "+")))
>>
>> strt=lm(y1~1,dat) # boot.stepAIC Works fine
>>
>> #strt=do.call("lm",list(frm1,data=dat)) ## boot.stepAIC FAILS ##
>>
>> #strt=lm(frm1,dat) ## boot.stepAIC FAILS ##
>>
>> limit<-5
>>
>>
>> stp=stepAIC(strt,direction='forward',steps=limit,
>>             scope=list(lower=frm1,upper=frm2))
>>
>> bst <- boot.stepAIC(strt,dat,B=50,alpha=0.05,direction='forward',steps=limit,
>>                     scope=list(lower=frm1,upper=frm2))
>>
>> b1 <- bst$Covariates
>> ball <- data.frame(b1)
>> names(ball)=unlist(trg)
>>
>> Any ideas?
>>
>> Cheers,
>> SOH
>>
>>
>>         [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> [hidden email] mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
> ______________________________________________
> [hidden email] mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: boot.stepAIC fails with computed formula

Bert Gunter-2
Thanks.

I suspect that would help to track down the problem in the underlying
code, but I am not sufficiently motivated to do so. Perhaps the
maintainer will if he sees this thread.

Cheers,
Bert


Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Wed, Aug 23, 2017 at 4:07 PM, Heinz Tuechler <[hidden email]> wrote:

> It seems that if you build the formula as a character string, and postpone
> the "as.formula" into the lm call, it works.
>
> instead of
> frm1 <- as.formula(paste(trg,"~1"))
> use
> frm1a <- paste(trg,"~1")
> and then
> strt <- lm(as.formula(frm1a),dat)
>
> regards,
>
> Heinz
>
> Stephen O'hagan wrote/hat geschrieben on/am 23.08.2017 12:07:
>>
>> Until I get a fix that works, a work-around would be to rename the 'y1'
>> column, used a fixed formula, and rename it back afterwards.
>>
>> Thanks for your help.
>> SGO.
>>
>> -----Original Message-----
>> From: Bert Gunter [mailto:[hidden email]]
>> Sent: 22 August 2017 20:38
>> To: Stephen O'hagan <[hidden email]>
>> Cc: [hidden email]
>> Subject: Re: [R] boot.stepAIC fails with computed formula
>>
>> OK, here's the problem. Continuing with your example:
>>
>> strt1 <- lm(y1 ~1, dat)
>> strt2 <- lm(frm1,dat)
>>
>>
>>> strt1
>>
>>
>> Call:
>> lm(formula = y1 ~ 1, data = dat)
>>
>> Coefficients:
>> (Intercept)
>>       41.73
>>
>>> strt2
>>
>>
>> Call:
>> lm(formula = frm1, data = dat)
>>
>> Coefficients:
>> (Intercept)
>>       41.73
>>
>>
>> Note that the formula objects of the lm object are different: strt2 does
>> not evaluate the formula. So presumably boot.step.AIC does no evaluation and
>> therefore gets confused with the errors you saw. So you need to get the
>> evaluated formula into the lm object. This can be done, e.g. via:
>>
>>> strt2 <- eval(substitute(lm(form,data = dat), list(form = frm1)))
>>
>>
>> ## yielding
>>
>>> strt2
>>
>>
>> Call:
>> lm(formula = y1 ~ 1, data = dat)
>>
>> Coefficients:
>> (Intercept)
>>       41.73
>>
>> So this looks like it should fix the problem, but alas no, the
>> boot.stepAIC call still fails with the same error message. Here's why:
>>
>>> identical(strt$call, strt2$call)
>>
>> [1] FALSE
>>
>> So one might rightfully ask, what the heck is going on here?! Further
>> digging:
>>
>>> str(strt$call)
>>
>>  language lm(formula = y1 ~ 1, data = dat)
>>
>>> str(strt2$call)
>>
>>  language lm(formula = y1 ~ 1, data = dat)
>>
>> These certainly look identical! -- but of course they're not:
>>
>>> names(strt$call)
>>
>> [1] ""        "formula" "data"
>>>
>>> names(strt2$call)
>>
>> [1] ""        "formula" "data"
>>
>> So the difference must lie in the formula component, right? ...
>>
>>> strt$call$formula
>>
>> y1 ~ 1
>>>
>>> strt2$call$formula
>>
>> y1 ~ 1
>>
>> So, thus far, huhh? But..
>>
>>> class(strt2$call$formula)
>>
>> [1] "formula"
>>
>>> class(strt$call$formula)
>>
>> [1] "call"
>>
>> So I think therein lies the critical difference that is screwing things
>> up. NOTE: If I am wrong about this someone **PLEASE** correct me.
>>
>> I see no clear workaround for this other than to explicitly avoid
>> passing a formula in the lm() call with y~1 or y ~ .   I think the
>> real fix is to make the  boot.stepAIC function smarter in how it handles
>> its formula argument, and that is above my paygrade (and degree of interest)
>> . You should probably email the maintainer, who may not monitor this list.
>> But give it a day or so to give someone else a chance to correct me if I'm
>> wrong.
>>
>>
>> HTH.
>>
>> Cheers,
>>
>> Bert
>> Bert Gunter
>>
>> "The trouble with having an open mind is that people keep coming along and
>> sticking things into it."
>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>>
>>
>> On Tue, Aug 22, 2017 at 8:17 AM, Stephen O'hagan
>> <[hidden email]> wrote:
>>>
>>> I'm trying to use boot.stepAIC for feature selection; I need to be able
>>> to specify the name of the dependent variable programmatically, but this
>>> appear to fail:
>>>
>>> In R-Studio with MS R Open 3.4:
>>>
>>> library(bootStepAIC)
>>>
>>> #Fake data
>>> n<-200
>>>
>>> x1 <- runif(n, -3, 3)
>>> x2 <- runif(n, -3, 3)
>>> x3 <- runif(n, -3, 3)
>>> x4 <- runif(n, -3, 3)
>>> x5 <- runif(n, -3, 3)
>>> x6 <- runif(n, -3, 3)
>>> x7 <- runif(n, -3, 3)
>>> x8 <- runif(n, -3, 3)
>>> y1 <- 42+x3 + 2*x6 + 3*x8 + runif(n, -0.5, 0.5)
>>>
>>> dat <- data.frame(x1,x2,x3,x4,x5,x6,x7,x8,y1)
>>> #the real data won't have these names...
>>>
>>> cn <- names(dat)
>>> trg <- "y1"
>>> xvars <- cn[cn!=trg]
>>>
>>> frm1<-as.formula(paste(trg,"~1"))
>>> frm2<-as.formula(paste(trg,"~ 1 + ",paste(xvars,collapse = "+")))
>>>
>>> strt=lm(y1~1,dat) # boot.stepAIC Works fine
>>>
>>> #strt=do.call("lm",list(frm1,data=dat)) ## boot.stepAIC FAILS ##
>>>
>>> #strt=lm(frm1,dat) ## boot.stepAIC FAILS ##
>>>
>>> limit<-5
>>>
>>>
>>> stp=stepAIC(strt,direction='forward',steps=limit,
>>>             scope=list(lower=frm1,upper=frm2))
>>>
>>> bst <-
>>> boot.stepAIC(strt,dat,B=50,alpha=0.05,direction='forward',steps=limit,
>>>                     scope=list(lower=frm1,upper=frm2))
>>>
>>> b1 <- bst$Covariates
>>> ball <- data.frame(b1)
>>> names(ball)=unlist(trg)
>>>
>>> Any ideas?
>>>
>>> Cheers,
>>> SOH
>>>
>>>
>>>         [[alternative HTML version deleted]]
>>>
>>> ______________________________________________
>>> [hidden email] mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>
>> ______________________________________________
>> [hidden email] mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
> ______________________________________________
> [hidden email] mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: boot.stepAIC fails with computed formula

Stephen O'hagan
In reply to this post by Heinz Tuechler
OK, I will try that.

I notified the maintainer of boot.stepAIC, so he might fix this in due course.

Thanks,
SGO.

-----Original Message-----
From: R-help [mailto:[hidden email]] On Behalf Of Heinz Tuechler
Sent: 24 August 2017 00:07
To: [hidden email]
Subject: Re: [R] boot.stepAIC fails with computed formula

It seems that if you build the formula as a character string, and postpone the "as.formula" into the lm call, it works.

instead of
frm1 <- as.formula(paste(trg,"~1"))
use
frm1a <- paste(trg,"~1")
and then
strt <- lm(as.formula(frm1a),dat)

regards,

Heinz

Stephen O'hagan wrote/hat geschrieben on/am 23.08.2017 12:07:

> Until I get a fix that works, a work-around would be to rename the 'y1' column, used a fixed formula, and rename it back afterwards.
>
> Thanks for your help.
> SGO.
>
> -----Original Message-----
> From: Bert Gunter [mailto:[hidden email]]
> Sent: 22 August 2017 20:38
> To: Stephen O'hagan <[hidden email]>
> Cc: [hidden email]
> Subject: Re: [R] boot.stepAIC fails with computed formula
>
> OK, here's the problem. Continuing with your example:
>
> strt1 <- lm(y1 ~1, dat)
> strt2 <- lm(frm1,dat)
>
>
>> strt1
>
> Call:
> lm(formula = y1 ~ 1, data = dat)
>
> Coefficients:
> (Intercept)
>       41.73
>
>> strt2
>
> Call:
> lm(formula = frm1, data = dat)
>
> Coefficients:
> (Intercept)
>       41.73
>
>
> Note that the formula objects of the lm object are different: strt2 does not evaluate the formula. So presumably boot.step.AIC does no evaluation and therefore gets confused with the errors you saw. So you need to get the evaluated formula into the lm object. This can be done, e.g. via:
>
>> strt2 <- eval(substitute(lm(form,data = dat), list(form = frm1)))
>
> ## yielding
>
>> strt2
>
> Call:
> lm(formula = y1 ~ 1, data = dat)
>
> Coefficients:
> (Intercept)
>       41.73
>
> So this looks like it should fix the problem, but alas no, the boot.stepAIC call still fails with the same error message. Here's why:
>
>> identical(strt$call, strt2$call)
> [1] FALSE
>
> So one might rightfully ask, what the heck is going on here?! Further digging:
>
>> str(strt$call)
>  language lm(formula = y1 ~ 1, data = dat)
>
>> str(strt2$call)
>  language lm(formula = y1 ~ 1, data = dat)
>
> These certainly look identical! -- but of course they're not:
>
>> names(strt$call)
> [1] ""        "formula" "data"
>> names(strt2$call)
> [1] ""        "formula" "data"
>
> So the difference must lie in the formula component, right? ...
>
>> strt$call$formula
> y1 ~ 1
>> strt2$call$formula
> y1 ~ 1
>
> So, thus far, huhh? But..
>
>> class(strt2$call$formula)
> [1] "formula"
>
>> class(strt$call$formula)
> [1] "call"
>
> So I think therein lies the critical difference that is screwing things up. NOTE: If I am wrong about this someone **PLEASE** correct me.
>
> I see no clear workaround for this other than to explicitly avoid
> passing a formula in the lm() call with y~1 or y ~ .   I think the
> real fix is to make the  boot.stepAIC function smarter in how it handles its formula argument, and that is above my paygrade (and degree of interest) . You should probably email the maintainer, who may not monitor this list. But give it a day or so to give someone else a chance to correct me if I'm wrong.
>
>
> HTH.
>
> Cheers,
>
> Bert
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along and sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
>
> On Tue, Aug 22, 2017 at 8:17 AM, Stephen O'hagan <[hidden email]> wrote:
>> I'm trying to use boot.stepAIC for feature selection; I need to be able to specify the name of the dependent variable programmatically, but this appear to fail:
>>
>> In R-Studio with MS R Open 3.4:
>>
>> library(bootStepAIC)
>>
>> #Fake data
>> n<-200
>>
>> x1 <- runif(n, -3, 3)
>> x2 <- runif(n, -3, 3)
>> x3 <- runif(n, -3, 3)
>> x4 <- runif(n, -3, 3)
>> x5 <- runif(n, -3, 3)
>> x6 <- runif(n, -3, 3)
>> x7 <- runif(n, -3, 3)
>> x8 <- runif(n, -3, 3)
>> y1 <- 42+x3 + 2*x6 + 3*x8 + runif(n, -0.5, 0.5)
>>
>> dat <- data.frame(x1,x2,x3,x4,x5,x6,x7,x8,y1)
>> #the real data won't have these names...
>>
>> cn <- names(dat)
>> trg <- "y1"
>> xvars <- cn[cn!=trg]
>>
>> frm1<-as.formula(paste(trg,"~1"))
>> frm2<-as.formula(paste(trg,"~ 1 + ",paste(xvars,collapse = "+")))
>>
>> strt=lm(y1~1,dat) # boot.stepAIC Works fine
>>
>> #strt=do.call("lm",list(frm1,data=dat)) ## boot.stepAIC FAILS ##
>>
>> #strt=lm(frm1,dat) ## boot.stepAIC FAILS ##
>>
>> limit<-5
>>
>>
>> stp=stepAIC(strt,direction='forward',steps=limit,
>>             scope=list(lower=frm1,upper=frm2))
>>
>> bst <- boot.stepAIC(strt,dat,B=50,alpha=0.05,direction='forward',steps=limit,
>>                     scope=list(lower=frm1,upper=frm2))
>>
>> b1 <- bst$Covariates
>> ball <- data.frame(b1)
>> names(ball)=unlist(trg)
>>
>> Any ideas?
>>
>> Cheers,
>> SOH
>>
>>
>>         [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> [hidden email] mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
> ______________________________________________
> [hidden email] mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.