survreg vs. aftreg (eha) - the relationship between fitted coefficients?

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
3 messages Options
Reply | Threaded
Open this post in threaded view
|

survreg vs. aftreg (eha) - the relationship between fitted coefficients?

Eleni Rapsomaniki-2
Dear R-users,

I need to use the aftreg function in package 'eha' to estimate failure times for left truncated survival data. Apparently, survreg still cannot fit such models. Both functions should be fitting the accelerated failure time (Weibull) model. However, as Göran Broström points out in the help file for aftreg, the parameterisation is different giving rise to different coefficients. The betas for adjusted covariates are opposite in sign but otherwise identical, whereas the intercept is quite different in a non-obvious way. The log-likelihoods are similar also, but not identical. I would like to find out how I can convert one set of coefficients to the other so as to obtain the same linear predictors using either model. Any ideas???

#the example below uses right-censored data for simplicity (the principle should be the same with left truncation I hope)
library(survival)
library(eha)

#  COMPARE coefs between survreg ('survival' pkg) and aftreg ('eha' pkg)
#Fitting NULL models (no covariates) results in (approximately) the same coefs (which is good!)

m1_NULL=survreg(Surv(futime/365, status==1) ~ 1, data=pbcseq)
m2_NULL=aftreg(Surv(futime/365, status==1) ~ 1, data=pbcseq)

c(m1_NULL$coef, 1/m1_NULL$scale) #--> intercept= 3.878656  ,  shape = 1.478177
c(m2_NULL$coef[1], exp(m2_NULL$coef[2])) #--> intercept= 3.878859 ,  shape=1.478150


# NOW I adjust for covariates

m1=survreg(Surv(futime/365, status==1) ~ chol+stage, data=pbcseq)
m2= aftreg(Surv(futime/365, status==1) ~ chol+stage, data=pbcseq)

###      m2  #######
#Coefficients:
# (Intercept)         chol        stage
# 5.944641913 -0.001692574 -0.470861324

#Scale= 0.6416744

#Loglik(model)= -483.9   Loglik(intercept only)= -506.8
#        Chisq= 45.91 on 2 degrees of freedom, p= 1.1e-10
#n=1124 (821 observations deleted due to missingness)

###      m2  #######

#Covariate          W.mean      Coef Exp(Coef)  se(Coef)    Wald p
#chol              303.777     0.002     1.002     0.000     0.000
#stage               3.298     0.460     1.584     0.119     0.000
#
#log(scale)                    5.029   152.807     0.477     0.000
#log(shape)                    0.467     1.595     0.095     0.000
#
#Events                    92
#Total time at risk          9017
#Max. log. likelihood      -484.31
#LR test statistic         45.0
#Degrees of freedom        2
#Overall p-value           1.64669e-10

Many thanks for any help you may be able to provide.

Eleni Rapsomaniki
Research Associate
University of Cambridge
Institute of Primary and Public Health

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: survreg vs. aftreg (eha) - the relationship between fitted coefficients?

Göran Broström
On Fri, Dec 10, 2010 at 5:21 PM, Eleni Rapsomaniki
<[hidden email]> wrote:
> Dear R-users,
>
> I need to use the aftreg function in package 'eha' to estimate failure times for left truncated survival data.

Be careful! This is only possible under strong assumptions about the
(unobserved) covariate vector. This is not (yet) reflected on the help
page for aftreg, but has been discussed here earlier.

 Apparently, survreg still cannot fit such models. Both functions
should be fitting the accelerated failure time (Weibull) model.
However, as Göran Broström points out in the help file for aftreg, the
parameterisation is different giving rise to different coefficients.
The betas for adjusted covariates are opposite in sign but otherwise
identical, whereas the intercept is quite different in a non-obvious
way. The log-likelihoods are similar also, but not identical. I would
like to find out how I can convert one set of coefficients to the
other so as to obtain the same linear predictors using either model.
Any ideas???

Yes. Since you work with Weibull data, try 'phreg' instead. You will
have to change sign of estimated coefficients, and divide them by
estimated shape, see the help page for 'weibreg', to compare to
parameters from survreg, but they are directly comparable with those
from coxph and coxreg.

Göran


>
> #the example below uses right-censored data for simplicity (the principle should be the same with left truncation I hope)
> library(survival)
> library(eha)
>
> #  COMPARE coefs between survreg ('survival' pkg) and aftreg ('eha' pkg)
> #Fitting NULL models (no covariates) results in (approximately) the same coefs (which is good!)
>
> m1_NULL=survreg(Surv(futime/365, status==1) ~ 1, data=pbcseq)
> m2_NULL=aftreg(Surv(futime/365, status==1) ~ 1, data=pbcseq)
>
> c(m1_NULL$coef, 1/m1_NULL$scale) #--> intercept= 3.878656  ,  shape = 1.478177
> c(m2_NULL$coef[1], exp(m2_NULL$coef[2])) #--> intercept= 3.878859 ,  shape=1.478150
>
>
> # NOW I adjust for covariates
>
> m1=survreg(Surv(futime/365, status==1) ~ chol+stage, data=pbcseq)
> m2= aftreg(Surv(futime/365, status==1) ~ chol+stage, data=pbcseq)
>
> ###      m2  #######
> #Coefficients:
> # (Intercept)         chol        stage
> # 5.944641913 -0.001692574 -0.470861324
>
> #Scale= 0.6416744
>
> #Loglik(model)= -483.9   Loglik(intercept only)= -506.8
> #        Chisq= 45.91 on 2 degrees of freedom, p= 1.1e-10
> #n=1124 (821 observations deleted due to missingness)
>
> ###      m2  #######
>
> #Covariate          W.mean      Coef Exp(Coef)  se(Coef)    Wald p
> #chol              303.777     0.002     1.002     0.000     0.000
> #stage               3.298     0.460     1.584     0.119     0.000
> #
> #log(scale)                    5.029   152.807     0.477     0.000
> #log(shape)                    0.467     1.595     0.095     0.000
> #
> #Events                    92
> #Total time at risk          9017
> #Max. log. likelihood      -484.31
> #LR test statistic         45.0
> #Degrees of freedom        2
> #Overall p-value           1.64669e-10
>
> Many thanks for any help you may be able to provide.
>
> Eleni Rapsomaniki
> Research Associate
> University of Cambridge
> Institute of Primary and Public Health
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



--
Göran Broström

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: survreg vs. aftreg (eha) - the relationship between fitted coefficients?

Andrew Jackson

Dear Göran

I too have been working on this for the last couple of weeks and this is a
great help thanks. I wonder can you advise on how the covariance matrix of
estimated parameters in phreg relates to those estimated in survreg.

Im hoping you can just put your finger on this. Your eha package has been
really useful so far and if I can get this working then i will be a very
happy person indeed.  Essentially, I am doing all this in order to run
phreg, then convert and pass the results to a survreg object in order to use
the predict function to create survival curves for each stratification with
confidence intervals around them. This is clunky I know, but unless there is
a similar predict method for phreg objects I am stuck, and I havent found
one yet.

What follow is a piece of code ive been working on to explore the
relationships between phreg and survreg outputs using simulated data.

#
------------------------------------------------------------------------------
# START OF SCRIPT EXAMPLE
#
------------------------------------------------------------------------------
rm(list=ls()) # clear the memory
graphics.off() # turn graphics off

set.seed(1) # for repeatability, set the random number seed

# Load the libraries
library(survival)
library(eha)

#
------------------------------------------------------------------------------

n <- 50 # number of observations per stratification
true.shape <- c(3,3,3,3)  # common shape parameter for each category
true.scale <- c(3,5,7,20) # separate scale parameter
X1.code <- c("A","A","B","B") # X1 encoded as A or B
X2.code <- c("P","Q","P","Q") # X2 encoded as P or Q

# Create the response variables and two fixed factor variables
Y <- NULL
X1 <- NULL
X2 <- NULL

# Loop over each combinatin of Factors and simulate some data from a
# weibull distribution
for (i in 1:length(true.shape)){
 
  # simulate the Y data
  tmp <- rweibull(n,true.shape[i],scale=true.scale[i])
  Y <- c(Y,tmp)
 
  # Encode the fixed factors
  X1 <-c (X1,rep(X1.code[i],n))
  X2 <-c (X2,rep(X2.code[i],n))
 
}  # close loop over factors

# X1 and X1 are factors
X1 <- as.factor(X1)
X2 <- as.factor(X2)

# Everyone has status 1 here (i.e. they were all observed to death/failure)
status <- rep(1,length(Y))


#
------------------------------------------------------------------------------
# plot the data
dev.new()
plot(survfit(Surv(Y,status)~X1+X2),"All Data")

# run survreg
m.surv <- survreg(Surv(Y,status)~X1*X2,dist="weibull")

# run phreg
m.ph <- phreg(Surv(Y,status)~X1*X2,dist="weibull")

# Covariance matrix for survreg() method
m.surv$var

# Covariance matrix for phreg() method
m.ph$var

#
------------------------------------------------------------------------------
# And these are the type of plots i want to produce.

new.data <- data.frame(X1=factor("B",levels=levels(X1)),
                        X2=factor("Q",levels=levels(X2)))
prob <- seq(0.0, 1.0, length=200)
temp <- predict(m.surv, type='quantile', p=prob, se.fit=T, newdata=new.data)
matlines(cbind(temp$fit, temp$fit +1.96*temp$se,
                   temp$fit - 1.96*temp$se),  1-prob,
                   lty=c(1,2,2), col=2, lwd=1)

#
------------------------------------------------------------------------------
# END OF SCRIPT EXAMPLE
#
------------------------------------------------------------------------------



As far as I can decipher the following are related:

var(log(shape)) from phreg equates directly to var(log(scale)) from survreg

var(log(scale)) = var(Intercept)

-cov(log(shape),log(scale) = cov(Intercept,log(scale)) # note the minus in
front of the phreg output

I am stuck though now on how the following relate

The variances of the covariates
var(X1B)
var(X2Q)
var(X1B:X2Q)

The covariance of the covariates
cov(X1B,X2Q)
cov(X1B:X2Q,X1B)
cov(X1B:X2Q,X2Q)

The covariance of the covariates with the shape and scale parameters
cov(log(shape),X1B)
cov(log(scale),X1B)
.. and similary with X2Q and X1B:X2Q

I look forward to hearing from you on this.

best wishes
Andrew Jackson

--
Dr Andrew Jackson
Lecturer
School of Natural Sciences
Zoology Building, Trinity College Dublin, Dublin 2, Ireland
Tel. + 353 1 896 2728, Fax. + 353 1 677 8094, Email. [hidden email]
http://www.tcd.ie/Zoology/research/research/theoretical/AndrewJackson.php

--
View this message in context: http://r.789695.n4.nabble.com/survreg-vs-aftreg-eha-the-relationship-between-fitted-coefficients-tp3082263p3219204.html
Sent from the R help mailing list archive at Nabble.com.

        [[alternative HTML version deleted]]


______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.