Different behavior of model.matrix between R 3.2 and R 3.1.1

Frank Harrell
Terry Therneau has been very helpful on r-help, but we can't figure out
what change in R in the past few months made extra columns appear in
model.matrix when the terms object is subset to remove stratification
factors in a Cox model.  Terry has changed his logic in the survival
package to avoid this issue, but his approach requires generating a
larger design matrix and then dropping columns.

A simple example is below.


strat <- function(x) x
d <- expand.grid(a=c('a1','a2'), b=c('b1','b2'))
d$y <- c(1,3,2,4)
f <- y ~ a * strat(b)
m <- model.frame(f, data=d)
Terms <- drop.terms(terms(f, data=d), 2)
model.matrix(Terms, m)

   (Intercept) aa2 aa1:strat(b)b2 aa2:strat(b)b2
1           1   0              0              0
2           1   1              0              0
3           1   0              1              0
4           1   1              0              1
. . .

The column aa1:strat(b)b2, corresponding to a='a1' and b='b2', should
not be there.

This does seem to be a change in R.  Any help appreciated.
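
The workaround mentioned above (build the larger design matrix, then drop
columns) might look roughly like the sketch below. This is not the survival
package's actual code. The names full_terms, asgn, drop_term and X_reduced are
made up for illustration, and the stratification term is matched by its literal
label "strat(b)", which is an assumption that only holds for this toy example.

```r
# Same toy data as in the example above
strat <- function(x) x
d <- expand.grid(a = c("a1", "a2"), b = c("b1", "b2"),
                 stringsAsFactors = TRUE)
d$y <- c(1, 3, 2, 4)
f <- y ~ a * strat(b)
m <- model.frame(f, data = d)

# Build the design matrix from the full (unsubsetted) terms object
full_terms <- terms(f, data = d)
X <- model.matrix(full_terms, m)

# "assign" maps each column to a term: 0 = intercept, k = k-th term label
asgn <- attr(X, "assign")
term_labels <- attr(full_terms, "term.labels")

# Hypothetical: identify the stratification main effect by its label
drop_term <- which(term_labels == "strat(b)")

# Keep every column that does not belong to the stratification main effect
X_reduced <- X[, asgn != drop_term, drop = FALSE]
colnames(X_reduced)
```

In this example the remaining columns are (Intercept), aa2 and aa2:strat(b)b2,
without the aa1:strat(b)b2 column.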


The factors and term.labels attributes of Terms are:

attr(,"factors")
          a a:strat(b)
y        0          0
a        1          2
strat(b) 0          1
attr(,"term.labels")
[1] "a"          "a:strat(b)"
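
One way to see where the extra column may come from is to compare the coding
of a inside the a:strat(b) term before and after drop.terms. According to
?terms.object, an entry of 1 in the factors matrix means the variable is coded
by contrasts, and 2 means it is coded by dummy variables for all levels (as
when a lower-order term is missing). The sketch below only inspects those
entries; full_terms, sub_terms, full_code and sub_code are illustrative names.

```r
strat <- function(x) x
d <- expand.grid(a = c("a1", "a2"), b = c("b1", "b2"),
                 stringsAsFactors = TRUE)
d$y <- c(1, 3, 2, 4)
f <- y ~ a * strat(b)

full_terms <- terms(f, data = d)
sub_terms  <- drop.terms(full_terms, 2)   # drops the strat(b) main effect

# Coding of variable a within the interaction term, full vs. subsetted terms
full_code <- attr(full_terms, "factors")["a", "a:strat(b)"]
sub_code  <- attr(sub_terms,  "factors")["a", "a:strat(b)"]

full_code   # contrast coding while strat(b) is still in the model
sub_code    # full dummy coding once the strat(b) main effect is gone
```

With the main effect strat(b) removed, a is no longer marginal to a term that
is in the model, so it gets one indicator per level inside the interaction,
which would account for both aa1:strat(b)b2 and aa2:strat(b)b2 appearing.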


Frank

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Frank Harrell
Department of Biostatistics, Vanderbilt University

Re: Different behavior of model.matrix between R 3.2 and R 3.1.1

Berry, Charles
On Tue, 16 Jun 2015, Frank Harrell wrote:

> Terry Therneau has been very helpful on r-help but we can't figure out what
> change in R in the past months made extra columns appear in model.matrix when
> the terms object is subsetted to remove stratification factors in a Cox
> model.  Terry has changed his logic in the survival package to avoid this
> issue but he requires generating a larger design matrix then dropping
> columns.
>
> A simple example is below.
>
>
> strat <- function(x) x
> d <- expand.grid(a=c('a1','a2'), b=c('b1','b2'))
> d$y <- c(1,3,2,4)
> f <- y ~ a * strat(b)
> m <- model.frame(f, data=d)
> Terms <- drop.terms(terms(f, data=d), 2)
> model.matrix(Terms, m)
>
>  (Intercept) aa2 aa1:strat(b)b2 aa2:strat(b)b2
> 1           1   0              0              0
> 2           1   1              0              0
> 3           1   0              1              0
> 4           1   1              0              1
> . . .
>
> The column corresponding to a='a1' b='b2' should not be there
> (aa1:strat(b)b2).
>
> This does seem to be a change in R.  Any help appreciated.

I get the same results with R 2.15.2 ("Trick or Treat"), so the change
must have happened before late 2012.

HTH,

Chuck
