Predictions from "coxph" or "cph" objects

classic Classic list List threaded Threaded
11 messages Options
Reply | Threaded
Open this post in threaded view
|

Predictions from "coxph" or "cph" objects

Axel Urbiz
Dear R users,

My apologies for the simple question, as I'm starting to learn the concepts
behind the Cox PH model. I was just experimenting with the survival and rms
packages for this.

I'm simply trying to obtain the expected survival time (as opposed to the
probability of survival at a given time t). I can't seem to find an option
from the "type" argument in the predict methods from coxph{survival} or
cph{rms} that will give me expected survival times.

library(rms)
options(na.action=na.exclude) # retain NA in predictions
fit <- coxph(Surv(time, status) ~ age + ph.ecog, lung)
fit2 <-  cph(Surv(time, status) ~ age + ph.ecog, lung)
head(predict(fit,type="lp"))
head(predict(fit2,type="lp"))


Thank you.

Regards,
Axel.

        [[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: Predictions from "coxph" or "cph" objects

David Winsemius

On Jul 5, 2014, at 5:28 AM, Axel Urbiz wrote:

> Dear R users,
>
> My apologies for the simple question, as I'm starting to learn the  
> concepts
> behind the Cox PH model. I was just experimenting with the survival  
> and rms
> packages for this.
>
> I'm simply trying to obtain the expected survival time (as opposed  
> to the
> probability of survival at a given time t).

What does "expected survival time" actually mean? Do you want the  
median survival time?

> I can't seem to find an option
> from the "type" argument in the predict methods from coxph{survival}  
> or
> cph{rms} that will give me expected survival times.
>
> library(rms)
> options(na.action=na.exclude) # retain NA in predictions
> fit <- coxph(Surv(time, status) ~ age + ph.ecog, lung)
> fit2 <-  cph(Surv(time, status) ~ age + ph.ecog, lung)
> head(predict(fit,type="lp"))
> head(predict(fit2,type="lp"))

`predict` will return the results of the regression, i.e. the log-
hazard ratios for each term in the RHS of the formula. What you want  
(as described in the Index for the survival package) is either  
`survfit` or `survexp`.

require(survival)
help(pack=survival)
?survfit
?survexp
?summary.survfit
?quantile.survfit   # to get the median
?print.summary.survfit

require(rms)
help(pack=rms)

The rms-package also adds a `survfit.cph` function but I have found  
the `survest` function also provides useful added features, beyond  
those offered by survfit

>
>
> Thank you.
>
> Regards,
> Axel.
>
> [[alternative HTML version deleted]]

This is a plain text mailing list.

--

David Winsemius, MD
Alameda, CA, USA

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: Predictions from "coxph" or "cph" objects

Axel Urbiz
Thank you David. It is my understanding that using survit below I get the
median predicted survival. I actually was looking for the mean. I can't
seem to find in the documentation how to get that.

options(na.action=na.exclude) # retain NA in predictions
fit <- coxph(Surv(time, status) ~ age + ph.ecog, lung)
pred <- survfit(fit, newdata=lung)
head(pred)

Thanks again,
Axel.



On Sat, Jul 5, 2014 at 1:54 PM, David Winsemius <[hidden email]>
wrote:

>
> On Jul 5, 2014, at 5:28 AM, Axel Urbiz wrote:
>
>  Dear R users,
>>
>> My apologies for the simple question, as I'm starting to learn the
>> concepts
>> behind the Cox PH model. I was just experimenting with the survival and
>> rms
>> packages for this.
>>
>> I'm simply trying to obtain the expected survival time (as opposed to the
>> probability of survival at a given time t).
>>
>
> What does "expected survival time" actually mean? Do you want the median
> survival time?
>
>
>  I can't seem to find an option
>> from the "type" argument in the predict methods from coxph{survival} or
>> cph{rms} that will give me expected survival times.
>>
>> library(rms)
>> options(na.action=na.exclude) # retain NA in predictions
>> fit <- coxph(Surv(time, status) ~ age + ph.ecog, lung)
>> fit2 <-  cph(Surv(time, status) ~ age + ph.ecog, lung)
>> head(predict(fit,type="lp"))
>> head(predict(fit2,type="lp"))
>>
>
> `predict` will return the results of the regression, i.e. the log-hazard
> ratios for each term in the RHS of the formula. What you want (as described
> in the Index for the survival package) is either `survfit` or `survexp`.
>
> require(survival)
> help(pack=survival)
> ?survfit
> ?survexp
> ?summary.survfit
> ?quantile.survfit   # to get the median
> ?print.summary.survfit
>
> require(rms)
> help(pack=rms)
>
> The rms-package also adds a `survfit.cph` function but I have found the
> `survest` function also provides useful added features, beyond those
> offered by survfit
>
>
>>
>> Thank you.
>>
>> Regards,
>> Axel.
>>
>>         [[alternative HTML version deleted]]
>>
>
> This is a plain text mailing list.
>
> --
>
> David Winsemius, MD
> Alameda, CA, USA
>
>

        [[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: Predictions from "coxph" or "cph" objects

David Winsemius

On Jul 5, 2014, at 12:43 PM, Axel Urbiz wrote:

> Thank you David. It is my understanding that using survfirsurvit  
> below I get the median predicted survival. I actually was looking  
> for the mean. I can't seem to find in the documentation how to get  
> that.
>
> options(na.action=na.exclude) # retain NA in predictions
> fit <- coxph(Surv(time, status) ~ age + ph.ecog, lung)
> pred <- survfit(fit, newdata=lung)
> head(pred)
>
There might be a way. I don't know it if so, so I would probably just  
use the definition of the mean:

sum(summary(pred)$surv* summary(pred)$time)/sum(  summary(pred)$time)

(I continue to take effort to keep my postings in plain text despite  
my mail-clients's efforts to match your formatted postings. It adds to  
the work of responders when you post formatted questions and responses.)


> Thanks again,
> Axel.
>
>
>
> On Sat, Jul 5, 2014 at 1:54 PM, David Winsemius <[hidden email]
> > wrote:
>
> On Jul 5, 2014, at 5:28 AM, Axel Urbiz wrote:
>
> Dear R users,
>
> My apologies for the simple question, as I'm starting to learn the  
> concepts
> behind the Cox PH model. I was just experimenting with the survival  
> and rms
> packages for this.
>
> I'm simply trying to obtain the expected survival time (as opposed  
> to the
> probability of survival at a given time t).
>
> What does "expected survival time" actually mean? Do you want the  
> median survival time?
>
>
> I can't seem to find an option
> from the "type" argument in the predict methods from coxph{survival}  
> or
> cph{rms} that will give me expected survival times.
>
> library(rms)
> options(na.action=na.exclude) # retain NA in predictions
> fit <- coxph(Surv(time, status) ~ age + ph.ecog, lung)
> fit2 <-  cph(Surv(time, status) ~ age + ph.ecog, lung)
> head(predict(fit,type="lp"))
> head(predict(fit2,type="lp"))
>
> `predict` will return the results of the regression, i.e. the log-
> hazard ratios for each term in the RHS of the formula. What you want  
> (as described in the Index for the survival package) is either  
> `survfit` or `survexp`.
>
> require(survival)
> help(pack=survival)
> ?survfit
> ?survexp
> ?summary.survfit
> ?quantile.survfit   # to get the median
> ?print.summary.survfit
>
> require(rms)
> help(pack=rms)
>
> The rms-package also adds a `survfit.cph` function but I have found  
> the `survest` function also provides useful added features, beyond  
> those offered by survfit
>
>
>
> Thank you.
>
> Regards,
> Axel.
>
>         [[alternative HTML version deleted]]
>
> This is a plain text mailing list.
>
> --
>
> David Winsemius, MD
> Alameda, CA, USA
>
>

David Winsemius, MD
Alameda, CA, USA

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: Predictions from "coxph" or "cph" objects

David Winsemius

On Jul 5, 2014, at 9:12 PM, David Winsemius wrote:

>
> On Jul 5, 2014, at 12:43 PM, Axel Urbiz wrote:
>
>> Thank you David. It is my understanding that using survfirsurvit  
>> below I get the median predicted survival. I actually was looking  
>> for the mean. I can't seem to find in the documentation how to get  
>> that.
>>
>> options(na.action=na.exclude) # retain NA in predictions
>> fit <- coxph(Surv(time, status) ~ age + ph.ecog, lung)
>> pred <- survfit(fit, newdata=lung)
>> head(pred)
>>
> There might be a way. I don't know it if so, so I would probably  
> just use the definition of the mean:
>
> sum(summary(pred)$surv* summary(pred)$time)/sum(  summary(pred)$time)
>

Er, I think I meant to type:

fit <- coxph(Surv(time, status) ~ age + ph.ecog, lung)
pred <- survfit(fit)

  sum(summary(pred)$surv* summary(pred)$time)/sum(  summary(pred)$surv)
[1] 211.0943


> (I continue to take effort to keep my postings in plain text despite  
> my mail-clients's efforts to match your formatted postings. It adds  
> to the work of responders when you post formatted questions and  
> responses.)
>
>
>> Thanks again,
>> Axel.
>>
>>
>>
>> On Sat, Jul 5, 2014 at 1:54 PM, David Winsemius <[hidden email]
>> > wrote:
>>
>> On Jul 5, 2014, at 5:28 AM, Axel Urbiz wrote:
>>
>> Dear R users,
>>
>> My apologies for the simple question, as I'm starting to learn the  
>> concepts
>> behind the Cox PH model. I was just experimenting with the survival  
>> and rms
>> packages for this.
>>
>> I'm simply trying to obtain the expected survival time (as opposed  
>> to the
>> probability of survival at a given time t).
>>
>> What does "expected survival time" actually mean? Do you want the  
>> median survival time?
>>
>>
>> I can't seem to find an option
>> from the "type" argument in the predict methods from  
>> coxph{survival} or
>> cph{rms} that will give me expected survival times.
>>
>> library(rms)
>> options(na.action=na.exclude) # retain NA in predictions
>> fit <- coxph(Surv(time, status) ~ age + ph.ecog, lung)
>> fit2 <-  cph(Surv(time, status) ~ age + ph.ecog, lung)
>> head(predict(fit,type="lp"))
>> head(predict(fit2,type="lp"))
>>
>> `predict` will return the results of the regression, i.e. the log-
>> hazard ratios for each term in the RHS of the formula. What you  
>> want (as described in the Index for the survival package) is either  
>> `survfit` or `survexp`.
>>
>> require(survival)
>> help(pack=survival)
>> ?survfit
>> ?survexp
>> ?summary.survfit
>> ?quantile.survfit   # to get the median
>> ?print.summary.survfit
>>
>> require(rms)
>> help(pack=rms)
>>
>> The rms-package also adds a `survfit.cph` function but I have found  
>> the `survest` function also provides useful added features, beyond  
>> those offered by survfit
>>
>>
>>
>> Thank you.
>>
>> Regards,
>> Axel.
>>
>>        [[alternative HTML version deleted]]
>>
>> This is a plain text mailing list.
>>
>> --
>>
>> David Winsemius, MD
>> Alameda, CA, USA
>>
>>
>
> David Winsemius, MD
> Alameda, CA, USA
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius, MD
Alameda, CA, USA

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: Predictions from "coxph" or "cph" objects

Göran Broström-3
David and Axel,

I have two comments to your discussion:

(i) The area under the survival curve is equal to the mean of the
distribution, so the estimate of the mean should be the sum of the areas
of the rectangles defined by the estimated survival curve and the
successive distances between observed event times.

Thus,

 > surv <- pred$surv
 > time <- pred$time
 > sum(surv * diff(time))

should give you the (estimated) mean). (Note that time[1] == 0, and
length(time) == length(surv) + 1)

(I do not think that David's suggestion gives the same answer, but I may
be wrong.)

(ii) With censored data, this may be a bad idea. For instance, when the
largest observation is a censoring time, you may badly underestimate the
mean. Your best hope is to be able to estimate a conditional mean of the
type E(T | T < x).

This is essentially a non-parametric situation, and therefore it is
better to stick to medians and quantiles.

Göran Broström

On 2014-07-06 06:17, David Winsemius wrote:

>
> On Jul 5, 2014, at 9:12 PM, David Winsemius wrote:
>
>>
>> On Jul 5, 2014, at 12:43 PM, Axel Urbiz wrote:
>>
>>> Thank you David. It is my understanding that using survfirsurvit
>>> below I get the median predicted survival. I actually was looking
>>> for the mean. I can't seem to find in the documentation how to get
>>> that.
>>>
>>> options(na.action=na.exclude) # retain NA in predictions
>>> fit <- coxph(Surv(time, status) ~ age + ph.ecog, lung)
>>> pred <- survfit(fit, newdata=lung)
>>> head(pred)
>>>
>> There might be a way. I don't know it if so, so I would probably
>> just use the definition of the mean:
>>
>> sum(summary(pred)$surv* summary(pred)$time)/sum(  summary(pred)$time)
>>
>
> Er, I think I meant to type:
>
> fit <- coxph(Surv(time, status) ~ age + ph.ecog, lung)
> pred <- survfit(fit)
>
>    sum(summary(pred)$surv* summary(pred)$time)/sum(  summary(pred)$surv)
> [1] 211.0943
>
>
>> (I continue to take effort to keep my postings in plain text despite
>> my mail-clients's efforts to match your formatted postings. It adds
>> to the work of responders when you post formatted questions and
>> responses.)
>>
>>
>>> Thanks again,
>>> Axel.
>>>
>>>
>>>
>>> On Sat, Jul 5, 2014 at 1:54 PM, David Winsemius <[hidden email]
>>>> wrote:
>>>
>>> On Jul 5, 2014, at 5:28 AM, Axel Urbiz wrote:
>>>
>>> Dear R users,
>>>
>>> My apologies for the simple question, as I'm starting to learn the
>>> concepts
>>> behind the Cox PH model. I was just experimenting with the survival
>>> and rms
>>> packages for this.
>>>
>>> I'm simply trying to obtain the expected survival time (as opposed
>>> to the
>>> probability of survival at a given time t).
>>>
>>> What does "expected survival time" actually mean? Do you want the
>>> median survival time?
>>>
>>>
>>> I can't seem to find an option
>>> from the "type" argument in the predict methods from
>>> coxph{survival} or
>>> cph{rms} that will give me expected survival times.
>>>
>>> library(rms)
>>> options(na.action=na.exclude) # retain NA in predictions
>>> fit <- coxph(Surv(time, status) ~ age + ph.ecog, lung)
>>> fit2 <-  cph(Surv(time, status) ~ age + ph.ecog, lung)
>>> head(predict(fit,type="lp"))
>>> head(predict(fit2,type="lp"))
>>>
>>> `predict` will return the results of the regression, i.e. the log-
>>> hazard ratios for each term in the RHS of the formula. What you
>>> want (as described in the Index for the survival package) is either
>>> `survfit` or `survexp`.
>>>
>>> require(survival)
>>> help(pack=survival)
>>> ?survfit
>>> ?survexp
>>> ?summary.survfit
>>> ?quantile.survfit   # to get the median
>>> ?print.summary.survfit
>>>
>>> require(rms)
>>> help(pack=rms)
>>>
>>> The rms-package also adds a `survfit.cph` function but I have found
>>> the `survest` function also provides useful added features, beyond
>>> those offered by survfit
>>>
>>>
>>>
>>> Thank you.
>>>
>>> Regards,
>>> Axel.
>>>
>>>         [[alternative HTML version deleted]]
>>>
>>> This is a plain text mailing list.
>>>
>>> --
>>>
>>> David Winsemius, MD
>>> Alameda, CA, USA
>>>
>>>
>>
>> David Winsemius, MD
>> Alameda, CA, USA
>>
>> ______________________________________________
>> [hidden email] mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
> David Winsemius, MD
> Alameda, CA, USA
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: Predictions from "coxph" or "cph" objects

Göran Broström-3
On 2014-07-06 10:48, Göran Broström wrote:

> David and Axel,
>
> I have two comments to your discussion:
>
> (i) The area under the survival curve is equal to the mean of the
> distribution, so the estimate of the mean should be the sum of the areas
> of the rectangles defined by the estimated survival curve and the
> successive distances between observed event times.
>
> Thus,
>
>  > surv <- pred$surv
>  > time <- pred$time
>  > sum(surv * diff(time))
>
> should give you the (estimated) mean). (Note that time[1] == 0, and
> length(time) == length(surv) + 1)

Well, this is not quite true; on the first interval the survival curve
is one, so you need to

 > surv <- c(1, surv)

first. But then the lengths of the surv and time vectors do not match so
you need to add a (large) time at the end of time. If the largest
observation is an event, 'no problem' (surv is zero), but otherwise ...

Btw, I tried

 > exit <- rexp(10)
 > event <- rep(1, 10)
 > fit <- coxph(Surv(exit, event) ~ 1)

 > survfit(fit)$surv
  [1] 0.90483742 0.80968410 0.71454371 0.61942215 0.52432953 0.42928471
  [7] 0.33432727 0.23955596 0.14529803 0.05345216

 > survfit(Surv(exit, event) ~ 1)$surv
[1] 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0.0

so be careful ...

Göran

>
> (I do not think that David's suggestion gives the same answer, but I may
> be wrong.)
>
> (ii) With censored data, this may be a bad idea. For instance, when the
> largest observation is a censoring time, you may badly underestimate the
> mean. Your best hope is to be able to estimate a conditional mean of the
> type E(T | T < x).
>
> This is essentially a non-parametric situation, and therefore it is
> better to stick to medians and quantiles.
>
> Göran Broström
>
> On 2014-07-06 06:17, David Winsemius wrote:
>>
>> On Jul 5, 2014, at 9:12 PM, David Winsemius wrote:
>>
>>>
>>> On Jul 5, 2014, at 12:43 PM, Axel Urbiz wrote:
>>>
>>>> Thank you David. It is my understanding that using survfirsurvit
>>>> below I get the median predicted survival. I actually was looking
>>>> for the mean. I can't seem to find in the documentation how to get
>>>> that.
>>>>
>>>> options(na.action=na.exclude) # retain NA in predictions
>>>> fit <- coxph(Surv(time, status) ~ age + ph.ecog, lung)
>>>> pred <- survfit(fit, newdata=lung)
>>>> head(pred)
>>>>
>>> There might be a way. I don't know it if so, so I would probably
>>> just use the definition of the mean:
>>>
>>> sum(summary(pred)$surv* summary(pred)$time)/sum(  summary(pred)$time)
>>>
>>
>> Er, I think I meant to type:
>>
>> fit <- coxph(Surv(time, status) ~ age + ph.ecog, lung)
>> pred <- survfit(fit)
>>
>>    sum(summary(pred)$surv* summary(pred)$time)/sum(  summary(pred)$surv)
>> [1] 211.0943
>>
>>
>>> (I continue to take effort to keep my postings in plain text despite
>>> my mail-clients's efforts to match your formatted postings. It adds
>>> to the work of responders when you post formatted questions and
>>> responses.)
>>>
>>>
>>>> Thanks again,
>>>> Axel.
>>>>
>>>>
>>>>
>>>> On Sat, Jul 5, 2014 at 1:54 PM, David Winsemius <[hidden email]
>>>>> wrote:
>>>>
>>>> On Jul 5, 2014, at 5:28 AM, Axel Urbiz wrote:
>>>>
>>>> Dear R users,
>>>>
>>>> My apologies for the simple question, as I'm starting to learn the
>>>> concepts
>>>> behind the Cox PH model. I was just experimenting with the survival
>>>> and rms
>>>> packages for this.
>>>>
>>>> I'm simply trying to obtain the expected survival time (as opposed
>>>> to the
>>>> probability of survival at a given time t).
>>>>
>>>> What does "expected survival time" actually mean? Do you want the
>>>> median survival time?
>>>>
>>>>
>>>> I can't seem to find an option
>>>> from the "type" argument in the predict methods from
>>>> coxph{survival} or
>>>> cph{rms} that will give me expected survival times.
>>>>
>>>> library(rms)
>>>> options(na.action=na.exclude) # retain NA in predictions
>>>> fit <- coxph(Surv(time, status) ~ age + ph.ecog, lung)
>>>> fit2 <-  cph(Surv(time, status) ~ age + ph.ecog, lung)
>>>> head(predict(fit,type="lp"))
>>>> head(predict(fit2,type="lp"))
>>>>
>>>> `predict` will return the results of the regression, i.e. the log-
>>>> hazard ratios for each term in the RHS of the formula. What you
>>>> want (as described in the Index for the survival package) is either
>>>> `survfit` or `survexp`.
>>>>
>>>> require(survival)
>>>> help(pack=survival)
>>>> ?survfit
>>>> ?survexp
>>>> ?summary.survfit
>>>> ?quantile.survfit   # to get the median
>>>> ?print.summary.survfit
>>>>
>>>> require(rms)
>>>> help(pack=rms)
>>>>
>>>> The rms-package also adds a `survfit.cph` function but I have found
>>>> the `survest` function also provides useful added features, beyond
>>>> those offered by survfit
>>>>
>>>>
>>>>
>>>> Thank you.
>>>>
>>>> Regards,
>>>> Axel.
>>>>
>>>>         [[alternative HTML version deleted]]
>>>>
>>>> This is a plain text mailing list.
>>>>
>>>> --
>>>>
>>>> David Winsemius, MD
>>>> Alameda, CA, USA
>>>>
>>>>
>>>
>>> David Winsemius, MD
>>> Alameda, CA, USA
>>>
>>> ______________________________________________
>>> [hidden email] mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>
>> David Winsemius, MD
>> Alameda, CA, USA
>>
>> ______________________________________________
>> [hidden email] mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: Predictions from "coxph" or "cph" objects

Axel Urbiz
many thanks all for this discussion. It was very helpful.

Best,
Axel.


On Sun, Jul 6, 2014 at 5:17 AM, Göran Broström <[hidden email]>
wrote:

> On 2014-07-06 10:48, Göran Broström wrote:
>
>> David and Axel,
>>
>> I have two comments to your discussion:
>>
>> (i) The area under the survival curve is equal to the mean of the
>> distribution, so the estimate of the mean should be the sum of the areas
>> of the rectangles defined by the estimated survival curve and the
>> successive distances between observed event times.
>>
>> Thus,
>>
>>  > surv <- pred$surv
>>  > time <- pred$time
>>  > sum(surv * diff(time))
>>
>> should give you the (estimated) mean). (Note that time[1] == 0, and
>> length(time) == length(surv) + 1)
>>
>
> Well, this is not quite true; on the first interval the survival curve is
> one, so you need to
>
> > surv <- c(1, surv)
>
> first. But then the lengths of the surv and time vectors do not match so
> you need to add a (large) time at the end of time. If the largest
> observation is an event, 'no problem' (surv is zero), but otherwise ...
>
> Btw, I tried
>
> > exit <- rexp(10)
> > event <- rep(1, 10)
> > fit <- coxph(Surv(exit, event) ~ 1)
>
> > survfit(fit)$surv
>  [1] 0.90483742 0.80968410 0.71454371 0.61942215 0.52432953 0.42928471
>  [7] 0.33432727 0.23955596 0.14529803 0.05345216
>
> > survfit(Surv(exit, event) ~ 1)$surv
> [1] 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0.0
>
> so be careful ...
>
> Göran
>
>
>
>> (I do not think that David's suggestion gives the same answer, but I may
>> be wrong.)
>>
>> (ii) With censored data, this may be a bad idea. For instance, when the
>> largest observation is a censoring time, you may badly underestimate the
>> mean. Your best hope is to be able to estimate a conditional mean of the
>> type E(T | T < x).
>>
>> This is essentially a non-parametric situation, and therefore it is
>> better to stick to medians and quantiles.
>>
>> Göran Broström
>>
>> On 2014-07-06 06:17, David Winsemius wrote:
>>
>>>
>>> On Jul 5, 2014, at 9:12 PM, David Winsemius wrote:
>>>
>>>
>>>> On Jul 5, 2014, at 12:43 PM, Axel Urbiz wrote:
>>>>
>>>>  Thank you David. It is my understanding that using survfirsurvit
>>>>> below I get the median predicted survival. I actually was looking
>>>>> for the mean. I can't seem to find in the documentation how to get
>>>>> that.
>>>>>
>>>>> options(na.action=na.exclude) # retain NA in predictions
>>>>> fit <- coxph(Surv(time, status) ~ age + ph.ecog, lung)
>>>>> pred <- survfit(fit, newdata=lung)
>>>>> head(pred)
>>>>>
>>>>>  There might be a way. I don't know it if so, so I would probably
>>>> just use the definition of the mean:
>>>>
>>>> sum(summary(pred)$surv* summary(pred)$time)/sum(  summary(pred)$time)
>>>>
>>>>
>>> Er, I think I meant to type:
>>>
>>> fit <- coxph(Surv(time, status) ~ age + ph.ecog, lung)
>>> pred <- survfit(fit)
>>>
>>>    sum(summary(pred)$surv* summary(pred)$time)/sum(  summary(pred)$surv)
>>> [1] 211.0943
>>>
>>>
>>>  (I continue to take effort to keep my postings in plain text despite
>>>> my mail-clients's efforts to match your formatted postings. It adds
>>>> to the work of responders when you post formatted questions and
>>>> responses.)
>>>>
>>>>
>>>>  Thanks again,
>>>>> Axel.
>>>>>
>>>>>
>>>>>
>>>>> On Sat, Jul 5, 2014 at 1:54 PM, David Winsemius <
>>>>> [hidden email]
>>>>>
>>>>>> wrote:
>>>>>>
>>>>>
>>>>> On Jul 5, 2014, at 5:28 AM, Axel Urbiz wrote:
>>>>>
>>>>> Dear R users,
>>>>>
>>>>> My apologies for the simple question, as I'm starting to learn the
>>>>> concepts
>>>>> behind the Cox PH model. I was just experimenting with the survival
>>>>> and rms
>>>>> packages for this.
>>>>>
>>>>> I'm simply trying to obtain the expected survival time (as opposed
>>>>> to the
>>>>> probability of survival at a given time t).
>>>>>
>>>>> What does "expected survival time" actually mean? Do you want the
>>>>> median survival time?
>>>>>
>>>>>
>>>>> I can't seem to find an option
>>>>> from the "type" argument in the predict methods from
>>>>> coxph{survival} or
>>>>> cph{rms} that will give me expected survival times.
>>>>>
>>>>> library(rms)
>>>>> options(na.action=na.exclude) # retain NA in predictions
>>>>> fit <- coxph(Surv(time, status) ~ age + ph.ecog, lung)
>>>>> fit2 <-  cph(Surv(time, status) ~ age + ph.ecog, lung)
>>>>> head(predict(fit,type="lp"))
>>>>> head(predict(fit2,type="lp"))
>>>>>
>>>>> `predict` will return the results of the regression, i.e. the log-
>>>>> hazard ratios for each term in the RHS of the formula. What you
>>>>> want (as described in the Index for the survival package) is either
>>>>> `survfit` or `survexp`.
>>>>>
>>>>> require(survival)
>>>>> help(pack=survival)
>>>>> ?survfit
>>>>> ?survexp
>>>>> ?summary.survfit
>>>>> ?quantile.survfit   # to get the median
>>>>> ?print.summary.survfit
>>>>>
>>>>> require(rms)
>>>>> help(pack=rms)
>>>>>
>>>>> The rms-package also adds a `survfit.cph` function but I have found
>>>>> the `survest` function also provides useful added features, beyond
>>>>> those offered by survfit
>>>>>
>>>>>
>>>>>
>>>>> Thank you.
>>>>>
>>>>> Regards,
>>>>> Axel.
>>>>>
>>>>>         [[alternative HTML version deleted]]
>>>>>
>>>>> This is a plain text mailing list.
>>>>>
>>>>> --
>>>>>
>>>>> David Winsemius, MD
>>>>> Alameda, CA, USA
>>>>>
>>>>>
>>>>>
>>>> David Winsemius, MD
>>>> Alameda, CA, USA
>>>>
>>>> ______________________________________________
>>>> [hidden email] mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>
>>>
>>> David Winsemius, MD
>>> Alameda, CA, USA
>>>
>>> ______________________________________________
>>> [hidden email] mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>>
        [[alternative HTML version deleted]]


______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: Predictions from "coxph" or "cph" objects

Frank Harrell
In reply to this post by Axel Urbiz
When using cph in the rms package there is a function Mean that operates  
on cph objects to produce an R function for computing the mean or  
restricted mean life time.

Frank

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Frank Harrell
Department of Biostatistics, Vanderbilt University
Reply | Threaded
Open this post in threaded view
|

Re: Predictions from "coxph" or "cph" objects

Therneau, Terry M., Ph.D.
In reply to this post by Axel Urbiz
I've been off on vacation for a few days and so am arriving late to this discussion.

  Try ?print.survfit, and look at the print.rmean option and the discussion thereof in the
"Details" section of the page.  It will answer your question, in more detail than you
asked.  The option applies to survival curves from coxph fits as well.

Short summary
1. For any positive random variable X, mean(x) = integral from 0 to inf of the survival
curve.  For some reason I found this particular homework problem impossible, back when I
was a new grad student, consequently I remember it well.

2. When there is censoring the survival curve never drops to zero, which makes the full
integral not evaluable.  There are well known responses to this problem, but the mean
survival is used by so few that these "well known" approaches are only known by a small
few. Enough people got confused by the resulting truncated mean that I set the default
option for print.rmean to "don't print it".  The downside is that the few who do want a
mean (like you) often have trouble discovering how to obtain it.

A reference manual (with index) for the survival package is sorely needed.  Someday...

Terry Therneau





Dear R users,

My apologies for the simple question, as I'm starting to learn the concepts
behind the Cox PH model. I was just experimenting with the survival and rms
packages for this.

I'm simply trying to obtain the expected survival time (as opposed to the
probability of survival at a given time t). I can't seem to find an option
from the "type" argument in the predict methods from coxph{survival} or
cph{rms} that will give me expected survival times.

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: Predictions from "coxph" or "cph" objects

Göran Broström-3
In reply to this post by Göran Broström-3
On 2014-07-06 11:17, Göran Broström wrote:

> On 2014-07-06 10:48, Göran Broström wrote:
>> David and Axel,
>>
>> I have two comments to your discussion:
>>
>> (i) The area under the survival curve is equal to the mean of the
>> distribution, so the estimate of the mean should be the sum of the areas
>> of the rectangles defined by the estimated survival curve and the
>> successive distances between observed event times.
>>
>> Thus,
>>
>>   > surv <- pred$surv
>>   > time <- pred$time
>>   > sum(surv * diff(time))
>>
>> should give you the (estimated) mean). (Note that time[1] == 0, and
>> length(time) == length(surv) + 1)
>
> Well, this is not quite true; on the first interval the survival curve
> is one, so you need to
>
>   > surv <- c(1, surv)
>
> first. But then the lengths of the surv and time vectors do not match so
> you need to add a (large) time at the end of time. If the largest
> observation is an event, 'no problem' (surv is zero), but otherwise ...
>
> Btw, I tried
>
>   > exit <- rexp(10)
>   > event <- rep(1, 10)
>   > fit <- coxph(Surv(exit, event) ~ 1)
>
>   > survfit(fit)$surv
>    [1] 0.90483742 0.80968410 0.71454371 0.61942215 0.52432953 0.42928471
>    [7] 0.33432727 0.23955596 0.14529803 0.05345216
>
>   > survfit(Surv(exit, event) ~ 1)$surv
> [1] 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0.0
>
> so be careful ...

Addendum: Note the argument 'type':

 > survfit(fit, type = "kalbfleisch-prentice")$surv
  [1] 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0.0

This gives, I think (at least if no ties), the nonparametric maximum
likelihood estimator (NPMLE) of the baseline survivor function in the PH
model, and thus coincides with the Kaplan-Meier estimator in the case of
no covariates. It drops down to zero if the largest observation is a
failure. See Kalbfleisch & Prentice (1980), pp. 84-86.

I suppose that the default type ( = "efron") simply uses the formula
S(t) = exp(-H(t)) (on an estimator of H(t)), which never drops down to
zero, censorings or no censorings. This relation does not hold for
discrete-time distributions, and even if our underlying model is
continuous, the resulting non-parametric estimator of H(t) define a
discrete-time distribution, which causes a small dilemma for the purist.
(Should 'type = "kalbfleisch-prentice"' be the default in survfit?)

However, in practice this is of no importance at all. If you stick to
the median and quantiles.

Göran

>
> Göran
>
>>
>> (I do not think that David's suggestion gives the same answer, but I may
>> be wrong.)
>>
>> (ii) With censored data, this may be a bad idea. For instance, when the
>> largest observation is a censoring time, you may badly underestimate the
>> mean. Your best hope is to be able to estimate a conditional mean of the
>> type E(T | T < x).
>>
>> This is essentially a non-parametric situation, and therefore it is
>> better to stick to medians and quantiles.
>>
>> Göran Broström
>>
>> On 2014-07-06 06:17, David Winsemius wrote:
>>>
>>> On Jul 5, 2014, at 9:12 PM, David Winsemius wrote:
>>>
>>>>
>>>> On Jul 5, 2014, at 12:43 PM, Axel Urbiz wrote:
>>>>
>>>>> Thank you David. It is my understanding that using survfirsurvit
>>>>> below I get the median predicted survival. I actually was looking
>>>>> for the mean. I can't seem to find in the documentation how to get
>>>>> that.
>>>>>
>>>>> options(na.action=na.exclude) # retain NA in predictions
>>>>> fit <- coxph(Surv(time, status) ~ age + ph.ecog, lung)
>>>>> pred <- survfit(fit, newdata=lung)
>>>>> head(pred)
>>>>>
>>>> There might be a way. I don't know it if so, so I would probably
>>>> just use the definition of the mean:
>>>>
>>>> sum(summary(pred)$surv* summary(pred)$time)/sum(  summary(pred)$time)
>>>>
>>>
>>> Er, I think I meant to type:
>>>
>>> fit <- coxph(Surv(time, status) ~ age + ph.ecog, lung)
>>> pred <- survfit(fit)
>>>
>>>     sum(summary(pred)$surv* summary(pred)$time)/sum(  summary(pred)$surv)
>>> [1] 211.0943
>>>
>>>
>>>> (I continue to take effort to keep my postings in plain text despite
>>>> my mail-clients's efforts to match your formatted postings. It adds
>>>> to the work of responders when you post formatted questions and
>>>> responses.)
>>>>
>>>>
>>>>> Thanks again,
>>>>> Axel.
>>>>>
>>>>>
>>>>>
>>>>> On Sat, Jul 5, 2014 at 1:54 PM, David Winsemius <[hidden email]
>>>>>> wrote:
>>>>>
>>>>> On Jul 5, 2014, at 5:28 AM, Axel Urbiz wrote:
>>>>>
>>>>> Dear R users,
>>>>>
>>>>> My apologies for the simple question, as I'm starting to learn the
>>>>> concepts
>>>>> behind the Cox PH model. I was just experimenting with the survival
>>>>> and rms
>>>>> packages for this.
>>>>>
>>>>> I'm simply trying to obtain the expected survival time (as opposed
>>>>> to the
>>>>> probability of survival at a given time t).
>>>>>
>>>>> What does "expected survival time" actually mean? Do you want the
>>>>> median survival time?
>>>>>
>>>>>
>>>>> I can't seem to find an option
>>>>> from the "type" argument in the predict methods from
>>>>> coxph{survival} or
>>>>> cph{rms} that will give me expected survival times.
>>>>>
>>>>> library(rms)
>>>>> options(na.action=na.exclude) # retain NA in predictions
>>>>> fit <- coxph(Surv(time, status) ~ age + ph.ecog, lung)
>>>>> fit2 <-  cph(Surv(time, status) ~ age + ph.ecog, lung)
>>>>> head(predict(fit,type="lp"))
>>>>> head(predict(fit2,type="lp"))
>>>>>
>>>>> `predict` will return the results of the regression, i.e. the log-
>>>>> hazard ratios for each term in the RHS of the formula. What you
>>>>> want (as described in the Index for the survival package) is either
>>>>> `survfit` or `survexp`.
>>>>>
>>>>> require(survival)
>>>>> help(pack=survival)
>>>>> ?survfit
>>>>> ?survexp
>>>>> ?summary.survfit
>>>>> ?quantile.survfit   # to get the median
>>>>> ?print.summary.survfit
>>>>>
>>>>> require(rms)
>>>>> help(pack=rms)
>>>>>
>>>>> The rms-package also adds a `survfit.cph` function but I have found
>>>>> the `survest` function also provides useful added features, beyond
>>>>> those offered by survfit
>>>>>
>>>>>
>>>>>
>>>>> Thank you.
>>>>>
>>>>> Regards,
>>>>> Axel.
>>>>>
>>>>>          [[alternative HTML version deleted]]
>>>>>
>>>>> This is a plain text mailing list.
>>>>>
>>>>> --
>>>>>
>>>>> David Winsemius, MD
>>>>> Alameda, CA, USA
>>>>>
>>>>>
>>>>
>>>> David Winsemius, MD
>>>> Alameda, CA, USA
>>>>
>>>> ______________________________________________
>>>> [hidden email] mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>> David Winsemius, MD
>>> Alameda, CA, USA
>>>
>>> ______________________________________________
>>> [hidden email] mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.