Dear all,

Is the following intentional? Am I missing anything in the documentation?

d<-data.frame(y=rnorm(10,5,.5),exp=rnorm(10), age=rnorm(10))
formula(lm(exp(y)~exp+age, data=d))
#--> exp(y) ~ exp + age

formula(lm(exp(y)~., data=d))
#--> exp(y) ~ age

The variable 'exp' (perhaps indicating "experience") is not included in the model. The same happens with 'log' (and other function names, I suppose).

Best,
Vito

--
==============================================
Vito M.R. Muggeo
Dip.to Sc Econom, Az e Statistiche
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 23895240
fax: 091 485726
http://dssm.unipa.it/vmuggeo
Associate Editor, Statistical Modelling
Chair, Statistical Modelling Society

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Functions are first class objects, so some kind of collision is bound to happen if you do this... so don't.
On January 30, 2018 3:11:56 AM PST, "Vito M. R. Muggeo" <[hidden email]> wrote:
> Is the following intentional? Am I missing anything in documentation?
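To make the name collision concrete: R resolves a name in call position by looking for a function and skipping non-function objects, so a column called exp and the function exp() can coexist in ordinary evaluation. The snippet below is a minimal sketch with a made-up two-row data frame (d, called, column, nm_all and nm_vars are illustrative names). It does not reproduce the formula issue itself; it only shows the lookup rule, and how all.names() and all.vars() differ in whether a function name shows up among the names taken from an expression.

```r
# Hypothetical toy data: one column shares its name with base::exp
d <- data.frame(y = c(1, 2), exp = c(10, 20))

# In call position, R looks for a *function* named exp and skips the column
called <- with(d, exp(y))

# As a bare symbol, exp refers to the column of d
column <- with(d, exp)

# all.names() also reports function names; all.vars() reports variables only
nm_all  <- all.names(quote(exp(y)))
nm_vars <- all.vars(quote(exp(y)))
```

Whether the formula code collects names in the first or the second way is not established in this thread; the sketch only shows that the two can disagree.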
Well...
?terms.formula says:

"data: a data frame from which the meaning of the special symbol . can be inferred. It is unused if there is no . in the formula."

So this seems to me to be an obscure bug, as I have found no warning against this admittedly confusing but still, I think, legal syntax. Note:

> d <- data.frame(log = runif(10), x = 1:10)
> y <- rnorm(10,5)

> m1 <- lm(y ~ ., data = d)
> formula(m1)
y ~ log + x

> m2 <- update(m1, formula =log(y) ~.)
> formula(m2)
log(y) ~ log + x

> m3 = lm(log(y) ~., data =d)
> formula(m3)
log(y) ~ x

As always, correction appreciated if I'm wrong.

Cheers,
Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

On Tue, Jan 30, 2018 at 6:23 AM, Jeff Newmiller <[hidden email]> wrote:
> Functions are first class objects, so some kind of collision is bound to happen if you do this... so don't.
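One way to sidestep the `.` expansion altogether is to spell out the right-hand side from the column names. The following is only a sketch and not something proposed in this thread: it assumes the response is built from the column y, and it uses reformulate() from stats with the response passed as a call. The names rhs, f and fit are illustrative.

```r
set.seed(1)
d <- data.frame(y = rnorm(10, 5, 0.5), exp = rnorm(10), age = rnorm(10))

# Every column except the one used in the response (assumed here to be y)
rhs <- setdiff(names(d), "y")

# Build exp(y) ~ exp + age explicitly; no '.' is involved
f <- reformulate(rhs, response = quote(exp(y)))

fit <- lm(f, data = d)
term_labels <- attr(terms(fit), "term.labels")
```

Because the formula contains no `.`, the terms in the fitted model do not depend on how terms.formula expands it.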
I poked at this a little bit and found that the issue exists in stats:::C_termsform (which is called by terms.formula). Here is a variation on the demonstrations provided by Vito and Bert earlier:

d<-data.frame(y=rnorm(10,5,.5), age=rnorm(10), exp=rnorm(10), log = runif(10))
fs <- list(y ~ ., exp(y) ~ ., log(y) ~ .)

lapply(fs, function(x) terms(x, data = d)[[3]])
## [[1]]
## age + exp + log
##
## [[2]]
## age + log
##
## [[3]]
## age + exp

lapply(fs, function(x) .External(stats:::C_termsform, x, NULL, d, FALSE, FALSE)[[3]])
## [[1]]
## age + exp + log
##
## [[2]]
## age + log
##
## [[3]]
## age + exp

I don't speak C so I stopped there.

Best,
Ista

On Tue, Jan 30, 2018 at 11:12 AM, Bert Gunter <[hidden email]> wrote:
> Well...
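The comparison above can be wrapped in a small helper that reports which columns the `.` expansion leaves out. This is a rough sketch: check_dot_expansion is a hypothetical name, and it assumes that every column not used as a variable in the response should appear as a term. No output is shown because the result depends on the R version in use; with the behaviour reported in this thread the second call would be expected to return "exp".

```r
# Hypothetical helper: columns of `data` that the '.' expansion of `f` omits
check_dot_expansion <- function(f, data) {
  expanded <- attr(terms(f, data = data), "term.labels")
  # Variables (not function names) used on the left-hand side
  response_vars <- all.vars(f[[2]])
  expected <- setdiff(names(data), response_vars)
  setdiff(expected, expanded)
}

set.seed(1)
d <- data.frame(y = rnorm(10, 5, 0.5), age = rnorm(10),
                exp = rnorm(10), log = runif(10))

dropped_plain <- check_dot_expansion(y ~ ., d)       # nothing should be dropped
dropped_exp   <- check_dot_expansion(exp(y) ~ ., d)  # version-dependent
```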