Different behavior of model.matrix between R 3.2 and R 3.1.1

Frank Harrell
For building design matrices for Cox proportional hazards models in the
cph function in the rms package I have always used this construct:

Terms    <- terms(formula, specials=c("strat", "cluster", "strata"), data=data)
specials <- attr(Terms, 'specials')
stra     <- specials$strat
Terms.ns <- Terms
if(length(stra)) {
  temp     <- untangle.specials(Terms.ns, "strat", 1)
  Terms.ns <- Terms.ns[- temp$terms]    # uses the [.terms function
}
X <- model.matrix(Terms.ns, X)[, -1, drop=FALSE]

The Terms.ns logic removes stratification factor "main effects" so that
if a stratification factor interacts with a non-stratification factor,
only the interaction terms are included, not the stratification factor
main effects. (In a Cox PH model, stratification goes into the
nonparametric survival curve part of the model.)

Lately this logic quit working: model.matrix now keeps the unneeded main
effects in the design matrix.  Does anyone know what changed in R between
3.1.1 and 3.2 that could have caused this, and is there a possible
workaround?
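
A minimal sketch of the term-removal logic described above, for checking on a given R version whether the stratification main effect survives in the design matrix. It assumes only the survival package, uses survival's strata() as a stand-in for rms's strat(), and runs on a made-up data frame d; none of these names come from cph itself.

```r
library(survival)   # provides untangle.specials() and strata()

# Hypothetical toy data; the variable names are illustrative only
d <- data.frame(age = c(50, 60, 70, 55, 65, 75),
                sex = factor(c("f", "m", "f", "m", "f", "m")))

# strata() stands in for rms's strat() so that only survival is needed
Terms    <- terms(~ age * strata(sex), specials = "strata", data = d)
temp     <- untangle.specials(Terms, "strata", 1)
Terms.ns <- Terms[-temp$terms]           # drop the strata main-effect term

labels_before <- attr(Terms,    "term.labels")
labels_after  <- attr(Terms.ns, "term.labels")
print(labels_before)
print(labels_after)

# Build the design matrix from the reduced terms and drop the intercept
mf <- model.frame(Terms, d)
X  <- model.matrix(Terms.ns, mf)[, -1, drop = FALSE]

# A strata(sex) main-effect column here would reproduce the problem
print(colnames(X))
```

Comparing colnames(X) under R 3.1.1 and R 3.2 should show whether the extra main-effect columns come from model.matrix itself or from the subsetting step.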

Note that cph is a kind of front-end to the survival package's coxph
function.  Terry Therneau uses more complex logic to construct the
design matrix reliably.  I'd like to avoid that logic because it creates
an overly wide design matrix before removing the unneeded columns.

Thanks for any assistance,
Frank


--
Frank E Harrell Jr
Professor and Chairman, Department of Biostatistics
School of Medicine, Vanderbilt University