Problems with coxph and survfit in a stratified model with interactions

classic Classic list List threaded Threaded
7 messages Options
rm
Reply | Threaded
Open this post in threaded view
|

Problems with coxph and survfit in a stratified model with interactions

rm
I’m trying to set up proportional hazard model that is stratified with respect to covariate 1 and has an interaction between covariate 1 and another variable, covariate 2. Both variables are categorical. In the following, I try to illustrate the two problems that I’ve encountered, using the lung dataset.

require(survival)
attach(lung)
cov1 <- as.factor(ph.ecog)
cov2 <- as.factor(sex)
levels(cov1)[levels(cov1)==0] <- "zero"
levels(cov1)[levels(cov1)==1] <- "one"
levels(cov1)[levels(cov1)==2] <- "two"
levels(cov1)[levels(cov1)==3] <- "three"
levels(cov2)[levels(cov2)==1] <- "male"
levels(cov2)[levels(cov2)==2] <- "female"
sCox <- coxph(Surv(time, status) ~ strata(cov1) + cov1:cov2)
df <- data.frame(
  cov1="zero",
  cov2="male"
)
sfCox <- survfit(sCox,newdata=df)

The first problem is the warning:

”Warning message: In coxph(Surv(time, status) ~ strata(cov1) + cov1:cov2) :
  X matrix deemed to be singular; variable 4 5 6 7 8"

To me, it seems that there are too many dummies generated.

The second problem is the error:

” Error in model.frame.default(data = df, formula = ~cov1:cov2) : 
 number of variables != number of variable names”. 
rm
Reply | Threaded
Open this post in threaded view
|

Re: Problems with coxph and survfit in a stratified model with interactions

rm
Dear all,

Reading a previous post to this forum, I think I managed to get the part of the code that generates the dataframe (that I input as newdata to survift) fixed. The problem was that I had not specified the full set of possible levels for the 2 factors in the dataframe.

Unfortunately, I still get the error ” number of variables != number of variable names” when I try to fit newdata to a very simple model that is stratified with respect to cov1 and has an interaction between cov1 and cov2. It seems that it is the interaction term cov1:cov2 that causes the problem.

Any help would be much appreciated. Complete code follows.

Roland



require(survival)
data(lung)
#
lung$cov1 <- as.factor(lung$ph.ecog)
lung$cov2 <- as.factor(lung$sex)
levels(lung$cov1)[levels(lung$cov1)==0] <- "zero"
levels(lung$cov1)[levels(lung$cov1)==1] <- "one"
levels(lung$cov1)[levels(lung$cov1)==2] <- "two"
levels(lung$cov1)[levels(lung$cov1)==3] <- "three"
levels(lung$cov2)[levels(lung$cov2)==1] <- "male"
levels(lung$cov2)[levels(lung$cov2)==2] <- "female"
#
df <- data.frame(
  cov1=factor("one", levels = levels(lung$cov1)),
  cov2=factor("male", levels = levels(lung$cov2))
)
#the following 2 lines work
sCox <- coxph(Surv(time, status) ~ cov1 + cov1:cov2, data=lung)
sfCox <- survfit(sCox,newdata=df)
#as do the following
sCox <- coxph(Surv(time, status) ~ strata(cov1) + cov2, data=lung)
sfCox <- survfit(sCox,newdata=df)
#whereas the following 2 lines doesn't, any suggestions of how to fix them?
sCox <- coxph(Surv(time, status) ~ strata(cov1) + cov1:cov2, data=lung)
sfCox <- survfit(sCox,newdata=df)
Reply | Threaded
Open this post in threaded view
|

Re: Problems with coxph and survfit in a stratified model with interactions

Therneau, Terry M., Ph.D.
In reply to this post by rm
Your df object (newdata) has to have ALL the variables in it.  Normally you wouldn't need
the strata variable, but in this case cov1 is also a covariate.  Look at the example I
gave earlier.

Terry T.

On 10/15/2012 05:00 AM, [hidden email] wrote:

> Many thanks for your very quick reply!  I tried replacing ?~ strata(cov1) +
> cov1:cov2 ? with  ?~ strata(cov1):cov2 ?. As a result, I get an error
> message saying that ?object 'cov1' not found?. I guess R refers to the data
> frame. Any idea how to fix the error? Complete code follows.
>
> Best regards,
> Roland
>
>
>
> require(survival)
> data(lung)
> #
> lung$cov1<- as.factor(lung$ph.ecog)
> lung$cov2<- as.factor(lung$sex)
> levels(lung$cov1)[levels(lung$cov1)==0]<- "zero"
> levels(lung$cov1)[levels(lung$cov1)==1]<- "one"
> levels(lung$cov1)[levels(lung$cov1)==2]<- "two"
> levels(lung$cov1)[levels(lung$cov1)==3]<- "three"
> levels(lung$cov2)[levels(lung$cov2)==1]<- "male"
> levels(lung$cov2)[levels(lung$cov2)==2]<- "female"
> #
> df<- data.frame(
>    cov2=factor("female", levels = levels(lung$cov2))
> )
> sCox<- coxph(Surv(time, status) ~ strata(cov1):cov2, data=lung)
> sfCox<- survfit(sCox,newdata=df)
>

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: Problems with coxph and survfit in a stratified model with interactions

Chris Andrews
In reply to this post by rm
rm wrote
I’m trying to set up proportional hazard model that is stratified with respect to covariate 1 and has an interaction between covariate 1 and another variable, covariate 2. Both variables are categorical. In the following, I try to illustrate the two problems that I’ve encountered, using the lung dataset.

require(survival)
attach(lung)
cov1 <- as.factor(ph.ecog)
cov2 <- as.factor(sex)
levels(cov1)[levels(cov1)==0] <- "zero"
levels(cov1)[levels(cov1)==1] <- "one"
levels(cov1)[levels(cov1)==2] <- "two"
levels(cov1)[levels(cov1)==3] <- "three"
levels(cov2)[levels(cov2)==1] <- "male"
levels(cov2)[levels(cov2)==2] <- "female"
sCox <- coxph(Surv(time, status) ~ strata(cov1) + cov1:cov2)
df <- data.frame(
  cov1="zero",
  cov2="male"
)
sfCox <- survfit(sCox,newdata=df)

The first problem is the warning:

”Warning message: In coxph(Surv(time, status) ~ strata(cov1) + cov1:cov2) :
  X matrix deemed to be singular; variable 4 5 6 7 8"

To me, it seems that there are too many dummies generated.

The second problem is the error:

” Error in model.frame.default(data = df, formula = ~cov1:cov2) : 
 number of variables != number of variable names”. 
Adding the main effect of cov2 allows the model to run

library(survival)

lung$cov1 <- factor(lung$ph.ecog, 0:3, c("zero","one","two","three"))
lung$cov2 <- factor(lung$sex, 1:2, c("male","female"))

sCox <- coxph(Surv(time, status) ~ strata(cov1) + cov2 + cov1:cov2, data=lung)

df <- data.frame(
  cov1=factor(0:3, 0:3, c("zero","one","two","three"))[1],
  cov2=factor(1:2, 1:2, c("male","female"))[1]
)
df

sfCox <- survfit(sCox, newdata=df)

plot(sfCox[1], conf=FALSE)
lines(sfCox[2], col=2)
lines(sfCox[3], col=3)
lines(sfCox[4], col=4)
rm
Reply | Threaded
Open this post in threaded view
|

Re: Problems with coxph and survfit in a stratified model with interactions

rm
In reply to this post by rm
Dear Terry,

Thanks for your valuable comments! Could you please restate the example that you refer to? If I include cov1 in the data frame, as you suggest, I get the following error message. Complete code follows.

” Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :
  contrasts can be applied only to factors with 2 or more levels"

Regarding Chris’s suggestion, I would prefer not to have the main effect of cov2 in the model. I would prefer to stick with “~ strata(cov1):cov2” to keep the interpretation of the regression coefficient straightforward.

Best regards,
Roland



require(survival)
data(lung)
#
lung$cov1 <- as.factor(lung$ph.ecog)
lung$cov2 <- as.factor(lung$sex)
levels(lung$cov1)[levels(lung$cov1)==0] <- "zero"
levels(lung$cov1)[levels(lung$cov1)==1] <- "one"
levels(lung$cov1)[levels(lung$cov1)==2] <- "two"
levels(lung$cov1)[levels(lung$cov1)==3] <- "three"
levels(lung$cov2)[levels(lung$cov2)==1] <- "male"
levels(lung$cov2)[levels(lung$cov2)==2] <- "female"
#
df <- data.frame(
  cov1=factor("one", levels = levels(lung$cov1)),
  cov2=factor("female", levels = levels(lung$cov2))
)
sCox <- coxph(Surv(time, status) ~ strata(cov1):cov2, data=lung)
sfCox <- survfit(sCox,newdata=df)
Reply | Threaded
Open this post in threaded view
|

Re: Problems with coxph and survfit in a stratified model with interactions

Andrews, Chris
Hi Roland (and Terry),

Is this the model you want to fit?
--a separate 'baseline' hazard for each stratum defined by cov1
--a coefficient for cov2 that is different for each stratum defined by cov1

Then you can have a separate call to coxph for each stratum.
sCox0 <- coxph(Surv(time, status) ~ cov2, data=lung, subset=cov1=="zero")
sCox1 <- coxph(Surv(time, status) ~ cov2, data=lung, subset=cov1=="one")
sCox2 <- coxph(Surv(time, status) ~ cov2, data=lung, subset=cov1=="two")

I compared betas of the two approaches and found them to be equivalent.
***HOWEVER, the plots of stratum=="two" for men are not equal.  And I don't yet understand that.

Chris

library(survival)

lung$cov1 <- factor(lung$ph.ecog, 0:3, c("zero","one","two","three"))
lung$cov2 <- factor(lung$sex, 1:2, c("male","female"))

# first suggestion
sCox <- coxph(Surv(time, status) ~ strata(cov1) + cov2 + cov1 :cov2, data=lung) # warning, X matrix singular
# Fit strata separately
sCox0 <- coxph(Surv(time, status) ~ cov2, data=lung, subset=cov1=="zero")
sCox1 <- coxph(Surv(time, status) ~ cov2, data=lung, subset=cov1=="one")
sCox2 <- coxph(Surv(time, status) ~ cov2, data=lung, subset=cov1=="two")
sCox3 <- coxph(Surv(time, status) ~ cov2, data=lung, subset=cov1=="three") # warning, only 1 observation

# compare summaries
summary(sCox)
#cov2female           -0.6151
#cov2male:cov1one      0.0182
#cov2male:cov1two     -0.2513
summary(sCox0)
#cov2female -0.615 (as above)
summary(sCox1)
#cov2female -0.633 (-0.6151 - 0.0182)
summary(sCox2)
#cov2female -0.364 (-0.6151 + 0.2513)
summary(sCox3)
#cov2female    0         1        0 NA       NA

# plot
df <- data.frame(
  cov1 =factor(rep(0:3      , each=2), 0:3, c("zero","one","two","three")),
  cov2 =factor(rep(1:2      ,times=4), 1:2, c("male","female"))
)
df

# fit 4 curves for men, method 1
sfCox  <- survfit(sCox , newdata=df[1,])
# fit 4 curves for men, method 2
sfCox0 <- survfit(sCox0, newdata=df[1,])
sfCox1 <- survfit(sCox1, newdata=df[3,])
sfCox2 <- survfit(sCox2, newdata=df[5,])
sfCox3 <- survfit(sCox3, newdata=df[7,])

# plot, men, method 1
plot (sfCox[1], col=1, mark.time=FALSE, conf=FALSE)
lines(sfCox[2], col=2, mark.time=FALSE)
lines(sfCox[3], col=3, mark.time=FALSE)
lines(sfCox[4], col=4, mark.time=FALSE)
# plot, men, method 2
lines(sfCox0, lty=2, lwd=3, mark.time=FALSE, col=1)
lines(sfCox1, lty=2, lwd=3, mark.time=FALSE, col=2)
lines(sfCox2, lty=2, lwd=3, mark.time=FALSE, col=3) # miss?!
lines(sfCox3, lty=2, lwd=3, mark.time=FALSE, col=4)

# repeat for women
# fit 4 curves for women, method 1
sfCoxw  <- survfit(sCox , newdata=df[2,])
# fit 4 curves for women, method 2
sfCoxw0 <- survfit(sCox0, newdata=df[2,])
sfCoxw1 <- survfit(sCox1, newdata=df[4,])
sfCoxw2 <- survfit(sCox2, newdata=df[6,])
sfCoxw3 <- survfit(sCox3, newdata=df[8,])

# plot, women, method 1
plot (sfCoxw[1], col=1, mark.time=FALSE, conf=FALSE)
lines(sfCoxw[2], col=2, mark.time=FALSE)
lines(sfCoxw[3], col=3, mark.time=FALSE)
lines(sfCoxw[4], col=4, mark.time=FALSE)
# plot, women, method 2
lines(sfCoxw0, lty=2, lwd=3, mark.time=FALSE, col=1)
lines(sfCoxw1, lty=2, lwd=3, mark.time=FALSE, col=2)
lines(sfCoxw2, lty=2, lwd=3, mark.time=FALSE, col=3)
lines(sfCoxw3, lty=2, lwd=3, mark.time=FALSE, col=4)



-----Original Message-----
From: rm [mailto:[hidden email]]
Sent: Monday, October 15, 2012 11:10 AM
To: [hidden email]
Subject: Re: [R] Problems with coxph and survfit in a stratified model with interactions

Dear Terry,

Thanks for your valuable comments! Could you please restate the example that you refer to? If I include cov1 in the data frame, as you suggest, I get the following error message. Complete code follows.

” Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :
  contrasts can be applied only to factors with 2 or more levels"

Regarding Chris’s suggestion, I would prefer not to have the main effect of
cov2 in the model. I would prefer to stick with “~ strata(cov1):cov2” to keep the interpretation of the regression coefficient straightforward.

Best regards,
Roland



require(survival)
data(lung)
#
lung$cov1 <- as.factor(lung$ph.ecog)
lung$cov2 <- as.factor(lung$sex)
levels(lung$cov1)[levels(lung$cov1)==0] <- "zero"
levels(lung$cov1)[levels(lung$cov1)==1] <- "one"
levels(lung$cov1)[levels(lung$cov1)==2] <- "two"
levels(lung$cov1)[levels(lung$cov1)==3] <- "three"
levels(lung$cov2)[levels(lung$cov2)==1] <- "male"
levels(lung$cov2)[levels(lung$cov2)==2] <- "female"
#
df <- data.frame(
  cov1=factor("one", levels = levels(lung$cov1)),
  cov2=factor("female", levels = levels(lung$cov2))
)
sCox <- coxph(Surv(time, status) ~ strata(cov1):cov2, data=lung) sfCox <- survfit(sCox,newdata=df)



--
View this message in context: http://r.789695.n4.nabble.com/Problems-with-coxph-and-survfit-in-a-stratified-model-with-interactions-tp4646096p4646245.html
Sent from the R help mailing list archive at Nabble.com.


**********************************************************
Electronic Mail is not secure, may not be read every day, and should not be used for urgent or sensitive issues
______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
rm
Reply | Threaded
Open this post in threaded view
|

Re: Problems with coxph and survfit in a stratified model with interactions

rm
Hi Chris,

Yes, that is exactly the model I want to fit, i.e. a separate 'baseline' hazard for each stratum (defined by cov1) and a coefficient for cov2 that is different for each stratum.

Is suspect that there is a problem with the following line of your code.

sCox <- coxph(Surv(time, status) ~ strata(cov1) + cov2 + cov1 :cov2, data=lung) # warning, X matrix singular

The reason why I suspect is that in an earlier post  Terry wrote that ” ..by using ~ strata(cov1) + cov1:cov2 you fooled the program into thinking that there was no strata by covariate interaction, ”

Best regards,
Roland