# coxph linear.predictors

9 messages
Open this post in threaded view
|

## coxph linear.predictors

 I would like to be able to construct hazard rates (or unconditional death prob) for many subjects from a given survfit. This will involve adjusting the ( n.event/n.risk) with (coxph object )\$linear.predictors I must be having another silly day as I cannot reproduce the linear predictor: fit <- coxph(Surv(futime, fustat) ~ age, data = ovarian) fit\$linear.predictors[1] [1] 2.612756 coef(fit)*model.matrix(fit)[1,1]      age 11.69021 The above is based on the help listing for coxph.object coefficients: the coefficients of the linear predictor, which multiply           the columns of the model matrix.  If the model is           over-determined there will be missing values in the vector           corresponding to the redundant columns in the model matrix. Also, please comment whether n.event/n.risk gives the baseline hazard exp(alpha) ? Please, help. Thanks a lot. Stephen B         [[alternative HTML version deleted]] ______________________________________________ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-helpPLEASE do read the posting guide http://www.R-project.org/posting-guide.htmland provide commented, minimal, self-contained, reproducible code.
Open this post in threaded view
|

## Re: coxph linear.predictors

 On Oct 27, 2010, at 12:12 PM, Bond, Stephen wrote: > I would like to be able to construct hazard rates (or unconditional   > death prob) Hazards are not probabilities (since probabilities are constrained to   the range [0,1] and hazards are unbounded upward.) > for many subjects from a given survfit. > > This will involve adjusting the ( n.event/n.risk) > with (coxph object )\$linear.predictors > I must be having another silly day as I cannot reproduce the linear   > predictor: > > fit <- coxph(Surv(futime, fustat) ~ age, data = ovarian) > fit\$linear.predictors[1] > [1] 2.612756 That's the linear predictor (the beta*X) and that particular number   only applies to the first case. > > coef(fit)*model.matrix(fit)[1,1] >     age > 11.69021 > I don't know what that might be and you are not telling us what you   think it is. > The above is based on the help listing for coxph.object > coefficients: the coefficients of the linear predictor, which multiply >          the columns of the model matrix.  If the model is >          over-determined there will be missing values in the vector >          corresponding to the redundant columns in the model matrix. > > Also, please comment whether n.event/n.risk The Nelson-Aalen estimator of the cumulative hazard as a function of   intervals prior to t is sum( n-event(t)/ n.risk(t)) > gives the baseline hazard exp(alpha) ? No. The "baseline hazard", as you are calling this, would be an   estimate for persons with all covariates = 0, so in this case is for   women of age=0. (Not a particularly interpretable result in many   situations. The baseline hazard following treated ovarian cancer for   neonates is not medically sensible.) What is the purpose of this request? Is someone telling you you need   to provide estimates for instantaneous hazards? -- David Winsemius, MD West Hartford, CT ______________________________________________ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-helpPLEASE do read the posting guide http://www.R-project.org/posting-guide.htmland provide commented, minimal, self-contained, reproducible code.
Open this post in threaded view
|

## Re: coxph linear.predictors

 >> >> coef(fit)*model.matrix(fit)[1,1] >>     age >> 11.69021 >> > I don't know what that might be and you are not telling us what you   > think it is. I think this the calculation of the linear predictor, multiplying > (the beta*X) I expected that coef(fit)*model.matrix(fit)[1,1]= fit\$linear.predictors[1] And since it does not, I am lost. Ultimately I need unconditional death probability from day 1 to day t in the end. Survfit takes too much time as it does a lot of things I do not need. I hope I should be able to get the death prob for every subject from a single call to survfit and then rescale with exp(linear.predictor) for each subject. Just a comment: as there is not deltat argument, I think that coxph assumes deltat=1 so the hazard is the conditional probability over the next time interval ( t(i-1), t(i) ). With dense data sampled daily this is a day interval. Your comments are appreciated. Stephen Bond -----Original Message----- From: David Winsemius [mailto:[hidden email]] Sent: Wednesday, October 27, 2010 1:15 PM To: Bond, Stephen Cc: [hidden email] Subject: Re: [R] coxph linear.predictors On Oct 27, 2010, at 12:12 PM, Bond, Stephen wrote: > I would like to be able to construct hazard rates (or unconditional   > death prob) Hazards are not probabilities (since probabilities are constrained to   the range [0,1] and hazards are unbounded upward.) > for many subjects from a given survfit. > > This will involve adjusting the ( n.event/n.risk) > with (coxph object )\$linear.predictors > I must be having another silly day as I cannot reproduce the linear   > predictor: > > fit <- coxph(Surv(futime, fustat) ~ age, data = ovarian) > fit\$linear.predictors[1] > [1] 2.612756 That's the linear predictor (the beta*X) and that particular number   only applies to the first case. > > coef(fit)*model.matrix(fit)[1,1] >     age > 11.69021 > I don't know what that might be and you are not telling us what you   think it is. > The above is based on the help listing for coxph.object > coefficients: the coefficients of the linear predictor, which multiply >          the columns of the model matrix.  If the model is >          over-determined there will be missing values in the vector >          corresponding to the redundant columns in the model matrix. > > Also, please comment whether n.event/n.risk The Nelson-Aalen estimator of the cumulative hazard as a function of   intervals prior to t is sum( n-event(t)/ n.risk(t)) > gives the baseline hazard exp(alpha) ? No. The "baseline hazard", as you are calling this, would be an   estimate for persons with all covariates = 0, so in this case is for   women of age=0. (Not a particularly interpretable result in many   situations. The baseline hazard following treated ovarian cancer for   neonates is not medically sensible.) What is the purpose of this request? Is someone telling you you need   to provide estimates for instantaneous hazards? -- David Winsemius, MD West Hartford, CT ______________________________________________ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-helpPLEASE do read the posting guide http://www.R-project.org/posting-guide.htmland provide commented, minimal, self-contained, reproducible code.
Open this post in threaded view
|

## Re: coxph linear.predictors

 On Oct 27, 2010, at 1:45 PM, Bond, Stephen wrote: >>> >>> coef(fit)*model.matrix(fit)[1,1] >>>    age >>> 11.69021 >>> > >> I don't know what that might be and you are not telling us what you >> think it is. > > I think this the calculation of the linear predictor, multiplying > >> (the beta*X) > > I expected that coef(fit)*model.matrix(fit)[1,1]= fit > \$linear.predictors[1] > > And since it does not, I am lost. > > Ultimately I need unconditional death probability from day 1 to day   > t in the end. Now it sounds like you do want cap-lambda, the cumulative hazard.   Let's call it Lambda Since S(t) = exp(-Lambda(t)), you can invert to get Lambda(t) = - log(S(t)) ?survfit Possibly: plot(-log( survfit(fit)\$surv )) > Survfit takes too much time as it does a lot of things I do not   > need. I hope I should be able to get the death prob for every   > subject from a single call to survfit and then rescale with   > exp(linear.predictor) for each subject. I have run into slowness with that approach myself. The rms package   provides survest and Predict methods, but if you wanted to stay in   survival, then why not create a survfit()\$surv object based on a   vector of means and then calculate an exp(newdata) by which you would   multiply the S(mean|t)? There is also a predict.coxph function with   its type="risk" parameter that you might want to check. > > Just a comment: as there is not deltat argument, I think that coxph   > assumes deltat=1 so the hazard is the conditional probability over   > the next time interval ( t(i-1), t(i) ). With dense data sampled   > daily this is a day interval. That's not the way I understand it. The Lambda() and S() functions   only change at the particular event times on a continuous time so they   are not assumed to have integer values. Pretty sure that Therneau   avoids presenting extraction methods for instantaneous hazards from   Cox models. In his book "Modeling Survival Data" he mostly sticks to   discussing cumulative hazard functions rather than estimating an   interval or instantaneous hazard. And fortunately he visits rhelp so   it's quite possible that my errors in this area will get expert review   and correction. > > Your comments are appreciated. > > Stephen Bond > > -----Original Message----- > From: David Winsemius [mailto:[hidden email]] > Sent: Wednesday, October 27, 2010 1:15 PM > To: Bond, Stephen > Cc: [hidden email] > Subject: Re: [R] coxph linear.predictors > > > On Oct 27, 2010, at 12:12 PM, Bond, Stephen wrote: > >> I would like to be able to construct hazard rates (or unconditional >> death prob) > > Hazards are not probabilities (since probabilities are constrained to > the range [0,1] and hazards are unbounded upward.) > >> for many subjects from a given survfit. >> >> This will involve adjusting the ( n.event/n.risk) >> with (coxph object )\$linear.predictors >> I must be having another silly day as I cannot reproduce the linear >> predictor: >> >> fit <- coxph(Surv(futime, fustat) ~ age, data = ovarian) >> fit\$linear.predictors[1] >> [1] 2.612756 > > That's the linear predictor (the beta*X) and that particular number > only applies to the first case. > >> >> coef(fit)*model.matrix(fit)[1,1] >>    age >> 11.69021 >> > > I don't know what that might be and you are not telling us what you > think it is. > > >> The above is based on the help listing for coxph.object >> coefficients: the coefficients of the linear predictor, which   >> multiply >>         the columns of the model matrix.  If the model is >>         over-determined there will be missing values in the vector >>         corresponding to the redundant columns in the model matrix. >> >> Also, please comment whether n.event/n.risk > > The Nelson-Aalen estimator of the cumulative hazard as a function of > intervals prior to t is sum( n-event(t)/ n.risk(t)) > >> gives the baseline hazard exp(alpha) ? > > No. The "baseline hazard", as you are calling this, would be an > estimate for persons with all covariates = 0, so in this case is for > women of age=0. (Not a particularly interpretable result in many > situations. The baseline hazard following treated ovarian cancer for > neonates is not medically sensible.) > > What is the purpose of this request? Is someone telling you you need > to provide estimates for instantaneous hazards? > > > -- > > David Winsemius, MD > West Hartford, CT > David Winsemius, MD West Hartford, CT ______________________________________________ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-helpPLEASE do read the posting guide http://www.R-project.org/posting-guide.htmland provide commented, minimal, self-contained, reproducible code.
Open this post in threaded view
|

## Re: coxph linear.predictors

 What you suggest is exactly what I intended. The big problem is that X*beta  != linear.predictor  # it works for glm That's a big loose end. The survival documentation on p41 implies that the above should be an equality, but it is not. And I could not produce the surv vector using a variety of formulas, so I guess this is the spirit of SAS disguising itself as an R package. Stephen Bond -----Original Message----- From: David Winsemius [mailto:[hidden email]] Sent: Wednesday, October 27, 2010 2:17 PM To: Bond, Stephen Cc: [hidden email] Subject: Re: [R] coxph linear.predictors On Oct 27, 2010, at 1:45 PM, Bond, Stephen wrote: >>> >>> coef(fit)*model.matrix(fit)[1,1] >>>    age >>> 11.69021 >>> > >> I don't know what that might be and you are not telling us what you >> think it is. > > I think this the calculation of the linear predictor, multiplying > >> (the beta*X) > > I expected that coef(fit)*model.matrix(fit)[1,1]= fit > \$linear.predictors[1] > > And since it does not, I am lost. > > Ultimately I need unconditional death probability from day 1 to day   > t in the end. Now it sounds like you do want cap-lambda, the cumulative hazard.   Let's call it Lambda Since S(t) = exp(-Lambda(t)), you can invert to get Lambda(t) = - log(S(t)) ?survfit Possibly: plot(-log( survfit(fit)\$surv )) > Survfit takes too much time as it does a lot of things I do not   > need. I hope I should be able to get the death prob for every   > subject from a single call to survfit and then rescale with   > exp(linear.predictor) for each subject. I have run into slowness with that approach myself. The rms package   provides survest and Predict methods, but if you wanted to stay in   survival, then why not create a survfit()\$surv object based on a   vector of means and then calculate an exp(newdata) by which you would   multiply the S(mean|t)? There is also a predict.coxph function with   its type="risk" parameter that you might want to check. > > Just a comment: as there is not deltat argument, I think that coxph   > assumes deltat=1 so the hazard is the conditional probability over   > the next time interval ( t(i-1), t(i) ). With dense data sampled   > daily this is a day interval. That's not the way I understand it. The Lambda() and S() functions   only change at the particular event times on a continuous time so they   are not assumed to have integer values. Pretty sure that Therneau   avoids presenting extraction methods for instantaneous hazards from   Cox models. In his book "Modeling Survival Data" he mostly sticks to   discussing cumulative hazard functions rather than estimating an   interval or instantaneous hazard. And fortunately he visits rhelp so   it's quite possible that my errors in this area will get expert review   and correction. > > Your comments are appreciated. > > Stephen Bond > > -----Original Message----- > From: David Winsemius [mailto:[hidden email]] > Sent: Wednesday, October 27, 2010 1:15 PM > To: Bond, Stephen > Cc: [hidden email] > Subject: Re: [R] coxph linear.predictors > > > On Oct 27, 2010, at 12:12 PM, Bond, Stephen wrote: > >> I would like to be able to construct hazard rates (or unconditional >> death prob) > > Hazards are not probabilities (since probabilities are constrained to > the range [0,1] and hazards are unbounded upward.) > >> for many subjects from a given survfit. >> >> This will involve adjusting the ( n.event/n.risk) >> with (coxph object )\$linear.predictors >> I must be having another silly day as I cannot reproduce the linear >> predictor: >> >> fit <- coxph(Surv(futime, fustat) ~ age, data = ovarian) >> fit\$linear.predictors[1] >> [1] 2.612756 > > That's the linear predictor (the beta*X) and that particular number > only applies to the first case. > >> >> coef(fit)*model.matrix(fit)[1,1] >>    age >> 11.69021 >> > > I don't know what that might be and you are not telling us what you > think it is. > > >> The above is based on the help listing for coxph.object >> coefficients: the coefficients of the linear predictor, which   >> multiply >>         the columns of the model matrix.  If the model is >>         over-determined there will be missing values in the vector >>         corresponding to the redundant columns in the model matrix. >> >> Also, please comment whether n.event/n.risk > > The Nelson-Aalen estimator of the cumulative hazard as a function of > intervals prior to t is sum( n-event(t)/ n.risk(t)) > >> gives the baseline hazard exp(alpha) ? > > No. The "baseline hazard", as you are calling this, would be an > estimate for persons with all covariates = 0, so in this case is for > women of age=0. (Not a particularly interpretable result in many > situations. The baseline hazard following treated ovarian cancer for > neonates is not medically sensible.) > > What is the purpose of this request? Is someone telling you you need > to provide estimates for instantaneous hazards? > > > -- > > David Winsemius, MD > West Hartford, CT > David Winsemius, MD West Hartford, CT ______________________________________________ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-helpPLEASE do read the posting guide http://www.R-project.org/posting-guide.htmland provide commented, minimal, self-contained, reproducible code.
Open this post in threaded view
|

## Re: coxph linear.predictors

 In reply to this post by David Winsemius I think I found the first piece of it. Typical SAS: > fit\$linear.predictors[1]-coef(fit)*(ovarian\$age[1]-mean(ovarian\$age))               age -4.4408920985e-16 Well misrepresented in several places... Stephen Bond -----Original Message----- From: David Winsemius [mailto:[hidden email]] Sent: Wednesday, October 27, 2010 1:15 PM To: Bond, Stephen Cc: [hidden email] Subject: Re: [R] coxph linear.predictors On Oct 27, 2010, at 12:12 PM, Bond, Stephen wrote: > I would like to be able to construct hazard rates (or unconditional   > death prob) Hazards are not probabilities (since probabilities are constrained to   the range [0,1] and hazards are unbounded upward.) > for many subjects from a given survfit. > > This will involve adjusting the ( n.event/n.risk) > with (coxph object )\$linear.predictors > I must be having another silly day as I cannot reproduce the linear   > predictor: > > fit <- coxph(Surv(futime, fustat) ~ age, data = ovarian) > fit\$linear.predictors[1] > [1] 2.612756 That's the linear predictor (the beta*X) and that particular number   only applies to the first case. > > coef(fit)*model.matrix(fit)[1,1] >     age > 11.69021 > I don't know what that might be and you are not telling us what you   think it is. > The above is based on the help listing for coxph.object > coefficients: the coefficients of the linear predictor, which multiply >          the columns of the model matrix.  If the model is >          over-determined there will be missing values in the vector >          corresponding to the redundant columns in the model matrix. > > Also, please comment whether n.event/n.risk The Nelson-Aalen estimator of the cumulative hazard as a function of   intervals prior to t is sum( n-event(t)/ n.risk(t)) > gives the baseline hazard exp(alpha) ? No. The "baseline hazard", as you are calling this, would be an   estimate for persons with all covariates = 0, so in this case is for   women of age=0. (Not a particularly interpretable result in many   situations. The baseline hazard following treated ovarian cancer for   neonates is not medically sensible.) What is the purpose of this request? Is someone telling you you need   to provide estimates for instantaneous hazards? -- David Winsemius, MD West Hartford, CT ______________________________________________ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-helpPLEASE do read the posting guide http://www.R-project.org/posting-guide.htmland provide commented, minimal, self-contained, reproducible code.
Open this post in threaded view
|

## Re: coxph linear.predictors

 In reply to this post by David Winsemius To close the issue: > S.ave <- survfit(fit) # this is average survival > S.1 <- (S.ave\$surv)^exp(fit\$linear.predictors[1]) # get subject 1 > S.1  [1] 0.848450223861993 0.696973203043377 0.530604282995790 0.391136296063732 0.228370373409774 0.132991279576665 0.071742127376216  [8] 0.030988671457923 0.012340713882587 0.004553665545169 0.000865566322530 0.000135746409544 > survfit(fit,new=ovarian[1,])\$surv   # subject 1 from survfit  [1] 0.848450223861993 0.696973203043377 0.530604282995791 0.391136296063732 0.228370373409774 0.132991279576665 0.071742127376216  [8] 0.030988671457923 0.012340713882587 0.004553665545169 0.000865566322530 0.000135746409544 A single call to survfit will suffice for thousands of subjects, avoiding the iterative loop calculations inside survfit. Stephen Bond -----Original Message----- From: David Winsemius [mailto:[hidden email]] Sent: Wednesday, October 27, 2010 1:15 PM To: Bond, Stephen Cc: [hidden email] Subject: Re: [R] coxph linear.predictors On Oct 27, 2010, at 12:12 PM, Bond, Stephen wrote: > I would like to be able to construct hazard rates (or unconditional   > death prob) Hazards are not probabilities (since probabilities are constrained to   the range [0,1] and hazards are unbounded upward.) > for many subjects from a given survfit. > > This will involve adjusting the ( n.event/n.risk) > with (coxph object )\$linear.predictors > I must be having another silly day as I cannot reproduce the linear   > predictor: > > fit <- coxph(Surv(futime, fustat) ~ age, data = ovarian) > fit\$linear.predictors[1] > [1] 2.612756 That's the linear predictor (the beta*X) and that particular number   only applies to the first case. > > coef(fit)*model.matrix(fit)[1,1] >     age > 11.69021 > I don't know what that might be and you are not telling us what you   think it is. > The above is based on the help listing for coxph.object > coefficients: the coefficients of the linear predictor, which multiply >          the columns of the model matrix.  If the model is >          over-determined there will be missing values in the vector >          corresponding to the redundant columns in the model matrix. > > Also, please comment whether n.event/n.risk The Nelson-Aalen estimator of the cumulative hazard as a function of   intervals prior to t is sum( n-event(t)/ n.risk(t)) > gives the baseline hazard exp(alpha) ? No. The "baseline hazard", as you are calling this, would be an   estimate for persons with all covariates = 0, so in this case is for   women of age=0. (Not a particularly interpretable result in many   situations. The baseline hazard following treated ovarian cancer for   neonates is not medically sensible.) What is the purpose of this request? Is someone telling you you need   to provide estimates for instantaneous hazards? -- David Winsemius, MD West Hartford, CT ______________________________________________ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-helpPLEASE do read the posting guide http://www.R-project.org/posting-guide.htmland provide commented, minimal, self-contained, reproducible code.