Is the output of survfit.coxph survival or baseline survival?

koshihaku
Dear all,
I am confused by the output of survfit.coxph.
Some say that the survival given by summary(survfit.coxph) is the baseline survival S_0, while others say it is the survival S=S_0^exp{beta*x}.

Which one is correct?

By the way, if I use "newdata=" in survfit, does that mean the survival is estimated at the covariate values in the new data frame?

Thank you very much!

Koshihaku

Re: Is the output of survfit.coxph survival or baseline survival?

David Winsemius

On Sep 30, 2011, at 9:31 PM, koshihaku wrote:

> Dear all,
> I am confused with the output of survfit.coxph.
> Someone said that the survival given by summary(survfit.coxph) is the
> baseline survival S_0, but some said that is the survival  
> S=S_0^exp{beta*x}.
>
> Which one is correct?

It may depend on what _some_ and _someone_ mean by S_0, and on who
they are. I have in the past posted erroneous answers, but the name to
search the archives for is 'Terry Therneau'. My current understanding
is that the survival S_0 is the estimated survival for a hypothetical
subject whose continuous and discrete covariates are all at their
means. (But I have been wrong before.) Here is some of what Therneau
has said about it:

http://finzi.psych.upenn.edu/Rhelp10/2010-October/257941.html
http://finzi.psych.upenn.edu/Rhelp10/2009-March/190341.html
http://finzi.psych.upenn.edu/Rhelp10/2009-February/189768.html

>
> By the way, if I use "newdata=" in the survfit, does that mean the  
> survival
> is estimated by the value of covariates in the new data frame?

In one sense yes, but in another sense, no. If you have a Cox fit and
you supply newdata, the beta estimates and the baseline survival still
come from the original data; newdata only sets the covariate values at
which the curve is evaluated. If you give survfit just a formula, then
there is no newdata argument, only a data argument.

Try this:
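  library(survival)  # provides coxph, survfit, Surv and the ovarian data
  fit <- coxph( Surv(futime, fustat)~rx, data=ovarian)
  plot( survfit(fit, newdata=data.frame(rx=1) ) )          # Cox-model curve at rx=1
  plot( survfit( Surv(futime, fustat)~rx, data=ovarian) )  # Kaplan-Meier curves by rx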

Then flipping back and forth between those curves might clarify, at  
least to the extent that I understand this question.

And here's a pathological extrapolation:

  plot(survfit(fit, newdata=data.frame(rx=1:3)))

# There is no rx=3 in the original data, but rx wasn't defined as a
# factor when given to coxph.
# Just checked to see if you could extrapolate past the end of a range
# of factor levels and very sensibly you cannot.
 > fit <- coxph( Surv(futime, fustat)~factor(rx), data=ovarian)
 > plot(survfit(fit, newdata=data.frame(rx=1:3)))
Error in model.frame.default(data = data.frame(rx = 1:3), formula = ~factor(rx),  :
   factor 'factor(rx)' has new level(s) 3


--
David.

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: Is the output of survfit.coxph survival or baseline survival?

Thomas Lumley-2
In reply to this post by koshihaku
On Sat, Oct 1, 2011 at 2:31 PM, koshihaku <[hidden email]> wrote:
> Dear all,
> I am confused with the output of survfit.coxph.
> Someone said that the survival given by summary(survfit.coxph) is the
> baseline survival S_0, but some said that is the survival S=S_0^exp{beta*x}.
>
> Which one is correct?

The baseline hazard as estimated in survfit.coxph is the hazard when
all covariates are equal to the sample mean (or the stratum mean for a
stratified model).   The means that it is using are available in the
$means component of the coxph object.   It is not the hazard
extrapolated to all covariates equal zero.
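
A minimal sketch of the point above, assuming a model with a single continuous covariate (age from the ovarian data that ships with the survival package; the choice of covariate is illustrative, not from the original post). It reads the reference value from the $means component and checks that the default curve from survfit matches the curve requested explicitly at that value:

```r
library(survival)

# Assumption: one continuous covariate, so fit$means holds its reference value
fit <- coxph(Surv(futime, fustat) ~ age, data = ovarian)

# Reference covariate value used when survfit() is called without newdata
ref_age <- as.numeric(fit$means)

# Default curve versus a curve requested explicitly at the reference value
s_default <- survfit(fit)
s_at_ref  <- survfit(fit, newdata = data.frame(age = ref_age))

# Compare the two sets of survival estimates
curves_match <- all.equal(as.numeric(s_default$surv), as.numeric(s_at_ref$surv))
curves_match
```

If the two curves agree, the default output is the survival at the reference covariate values, not at covariates equal to zero.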

The centering at the sample mean is done for three reasons:
1/ it's computationally convenient
2/ it's numerically more stable
3/ it makes the baseline hazard more interpretable, since at least it
is the hazard for a set of covariate values somewhere in the interior
of your data.
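
For reference, the link between the two conventions in the original question can be written out as follows. This is a sketch under the usual proportional hazards assumptions; the notation (S_0 for the curve at all covariates zero, \bar{x} for the vector of covariate means) is introduced here for illustration.

```latex
% Survival at covariates x in terms of the zero-covariate baseline S_0
S(t \mid x) = S_0(t)^{\exp(\beta^\top x)}

% Survival at the covariate means \bar{x}
S_{\bar{x}}(t) = S_0(t)^{\exp(\beta^\top \bar{x})}

% Hence the same curve expressed relative to the mean-centered reference
S(t \mid x) = S_{\bar{x}}(t)^{\exp(\beta^\top (x - \bar{x}))}
```

Both forms describe the same family of curves; only the reference subject differs.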

   -thomas

--
Thomas Lumley
Professor of Biostatistics
University of Auckland


Re: Is the output of survfit.coxph survival or baseline survival?

Therneau, Terry M., Ph.D.
In reply to this post by koshihaku
> Dear all,
> I am confused with the output of survfit.coxph.
> Someone said that the survival given by summary(survfit.coxph) is the
> baseline survival S_0, but some said that is the survival
> S=S_0^exp{beta*x}.
>
> Which one is correct?

 The "baseline survival", which is the survival for a hypothetical subject
with all covariates=0, may be useful mathematical shorthand when writing a
book but I cannot think of a single case where the resulting curve would be
of any practical interest in medical data.  For this reason my survival
routines in R NEVER return it.  (Ask yourself "what is the survival for
someone with blood pressure=0, cholesterol=0, weight=0, ....".  The answer
is that they are either non-existent or dead).
 The intention with survfit is that you will give it a second data set
containing one or more lines, each of which describes a subject whose
predicted survival is of interest.  If no such data is given, the survival
for someone with all covariates equal to the mean is given.  This is better
than covariates=0, but sometimes not by much.  (What if sex were coded as a
0/1 numeric: do we get the survival of a hermaphrodite?)

Your best approach is to forget the phrase "baseline survival" and focus on
covariate sets of interest to you.
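
As an illustration of focusing on covariate sets of interest, here is a hedged sketch using the ovarian data from the survival package. The model (age only) and the two ages are hypothetical choices made for the example. Each row of newdata yields one predicted curve, and the curves are related through the fitted coefficient as the proportional hazards model implies:

```r
library(survival)

fit <- coxph(Surv(futime, fustat) ~ age, data = ovarian)

# Hypothetical subjects of interest: ages 50 and 60 are illustrative choices
subjects <- data.frame(age = c(50, 60))
sf <- survfit(fit, newdata = subjects)   # one column of sf$surv per row of newdata

# Under proportional hazards: S(t | age 60) = S(t | age 50)^exp(beta * 10)
beta <- unname(coef(fit))
ph_check <- all.equal(sf$surv[, 2], sf$surv[, 1]^exp(beta * 10))
ph_check

# Plot both predicted curves
plot(sf, lty = 1:2)
```

Neither curve is a "baseline"; each is simply the predicted survival for the subject described by the corresponding row.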

Terry Therneau


Re: Is the output of survfit.coxph survival or baseline survival?

koshihaku
In reply to this post by koshihaku
Dear all,
Your advice was a great help to my study. Thank you very much!