# Survival::coxph (clogit), survConcordance vs. summary(fit) concordance

5 messages

## Survival::coxph (clogit), survConcordance vs. summary(fit) concordance

Hi,

I'm running conditional logistic regression with survival::clogit. I have "1-1 case-control" data, i.e., there is 1 case and 1 control in each stratum.

Model:

    fit <- clogit(resp ~ x1 + x2, strata(ID), cluster(site), method = "efron", data = dat)

where resp is 1's and 0's, and x1 and x2 are both continuous.

Both predictors are significant. A snippet of summary(fit):

    Concordance= 0.763  (se = 0.5 )
    Rsquare= 0.304   (max possible= 0.5 )
    Likelihood ratio test= 27.54  on 2 df,   p=1.047e-06
    Wald test            = 17.19  on 2 df,   p=0.0001853
    Score (logrank) test = 17.43  on 2 df,   p=0.0001644,   Robust = 6.66  p=0.03574

The concordance estimate seems good but the SE is HUGE.

I get a very different estimate from the survConcordance function, which I know says it computes concordance for a "single continuous covariate", but it runs on my model with 2 continuous covariates:

    survConcordance(Surv(rep(1, 76L), resp) ~ predict(fit), dat)
    n= 76
    Concordance= 0.9106648 se= 0.09365047
    concordant  discordant   tied.risk   tied.time    std(c-d)
     1315.0000   129.0000     0.0000   703.0000   270.4626

Are both of these concordance estimates valid but providing different information?

Is one more appropriate for measuring "performance" (in the AUC sense) of conditional logistic models?

Is it possible that the HUGE SE estimate represents a convergence problem (no warnings were thrown when fitting the model), or is this model just useless?

Thanks!

--
Cooperative Fish and Wildlife Research Unit
Zoology and Physiology Dept.
University of Wyoming
[hidden email] / 914.707.8506
wyocoopunit.org

[hidden email] mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
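A back-of-the-envelope check on the numbers quoted in the question, offered as one possible reading and not as something stated in the thread: the survConcordance counts are consistent with every case being compared with every control across all 38 strata, while 0.763 is consistent with 29 of the 38 within-pair comparisons being concordant. The value 29 is an inference from the rounded 0.763; it does not appear anywhere in the post. A minimal sketch in base R:

```r
# Counts copied from the survConcordance output in the question.
n_pairs    <- 76 / 2          # 38 matched pairs: 38 cases and 38 controls
concordant <- 1315
discordant <- 129
tied_time  <- 703

# Every case against every control, ignoring strata, gives 38 * 38 comparisons.
concordant + discordant == n_pairs^2        # TRUE

# Pairs of two cases share the same "time" and event status.
tied_time == choose(n_pairs, 2)             # TRUE

# Pooled (across-strata) concordance, as survConcordance reported it.
pooled_c <- concordant / (concordant + discordant)
pooled_c

# ASSUMPTION: 29 concordant pairs out of 38 is a guess that reproduces the
# rounded 0.763 printed by summary(fit); it is not taken from the post.
within_c <- 29 / n_pairs
within_c
```

If this reading holds, the two figures answer different questions: one ranks cases against all controls in the data set, the other only checks whether each case outranks its own matched control.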

## Re: Survival::coxph (clogit), survConcordance vs. summary(fit) concordance

I only get the digest, sorry if this has already been answered.

When I run your code (after creating some data) I get a warning that "weights are ignored in clogit". This is a result of miscalling the clogit function. The first 2 commas should be +s.

    library(survival)
    nn <- 1000
    # simulated data: 2 rows per ID, 10 rows per site
    dat <- data.frame(resp = rbinom(nn, 1, 0.5), x1 = rnorm(nn), x2 = rnorm(nn),
                      ID = rep(seq(nn/2), each = 2), site = rep(seq(nn/10), each = 10))
    # strata() and cluster() passed as separate arguments: warning
    fit <- clogit(resp ~ x1 + x2, strata(ID), cluster(site), method = "efron", data = dat)
    # strata() and cluster() inside the formula: no warning
    fit <- clogit(resp ~ x1 + x2 + strata(ID) + cluster(site), method = "efron", data = dat)
    summary(fit)

Chris

-----Original Message-----
From: Joe Ceradini [mailto:[hidden email]]
Sent: Tuesday, January 19, 2016 12:48 PM
To: [hidden email]
Subject: [R] Survival::coxph (clogit), survConcordance vs. summary(fit) concordance

## Re: Survival::coxph (clogit), survConcordance vs. summary(fit) concordance

Thanks for pointing that out, Chris. That was a thoughtless typo on my part when I was simplifying my model for the sake of posting. I've run a whole set of models without any problems or warnings.

My main question is about the difference between the concordance estimate that summary(fit) reports and the concordance estimated with survConcordance, particularly in relation to estimating clogit model performance. I'd also like to know whether I should be concerned about the giant SE estimate I get for concordance from summary(fit). This is within the context of a 1:1 case-control study (1 case and 1 control per stratum).

Corrected model:

    fit <- clogit(resp ~ x1 + x2 + strata(ID) + cluster(site), method = "efron", data = dat)

where resp is 1's and 0's, and x1 and x2 are both continuous. The rest of the code and output details should be in my original post.

Thanks.
Joe

On Wed, Jan 20, 2016 at 6:11 AM, Andrews, Chris <[hidden email]> wrote:

> When I run your code (after creating some data) I get a warning that "weights are ignored in clogit". This is a result of miscalling the clogit function. The first 2 commas should be +s.
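To make the question about the two concordance figures concrete, here is a rough sketch on simulated 1:1 matched data. It computes by hand (a) the share of pairs in which the case outranks its own matched control and (b) the share of all case-control comparisons, pooled across strata, in which the case outranks the control. Everything in it is illustrative: the data, effect sizes, seed, and the helper names `within_pair_concordance` and `pooled_concordance` are made up, and `cluster(site)` is left out for simplicity. It also assumes, as I understand the survival documentation, that `predict()` on a stratified fit centers the linear predictor within strata by default; that centering is imitated here with `ave()` so the sketch does not depend on it.

```r
library(survival)

set.seed(1)                       # arbitrary seed, for reproducibility only
n_pairs <- 200                    # hypothetical number of matched pairs
dat <- data.frame(
  ID   = rep(seq_len(n_pairs), each = 2),
  resp = rep(c(1, 0), times = n_pairs),    # one case, one control per pair
  x1   = rnorm(2 * n_pairs),
  x2   = rnorm(2 * n_pairs)
)
# Shift the cases so the covariates carry some signal (made-up effect sizes).
dat$x1 <- dat$x1 + 0.8 * dat$resp
dat$x2 <- dat$x2 + 0.5 * dat$resp

fit <- clogit(resp ~ x1 + x2 + strata(ID), method = "efron", data = dat)

# Linear predictor computed directly from the coefficients (no centering).
lp_raw <- drop(as.matrix(dat[, c("x1", "x2")]) %*% coef(fit))
# Same scores centered within each pair, imitating stratum-wise centering.
lp_ctr <- lp_raw - ave(lp_raw, dat$ID)

# (a) Share of pairs where the case scores higher than its own control.
within_pair_concordance <- function(score, resp, id) {
  d <- tapply(ifelse(resp == 1, score, -score), id, sum)  # case minus control
  mean(d > 0)
}

# (b) Share of ALL case-control comparisons where the case scores higher,
#     ignoring which pair each subject belongs to (ties count as not concordant).
pooled_concordance <- function(score, resp) {
  mean(outer(score[resp == 1], score[resp == 0], ">"))
}

c_within_raw <- within_pair_concordance(lp_raw, dat$resp, dat$ID)
c_within_ctr <- within_pair_concordance(lp_ctr, dat$resp, dat$ID)
c_pooled_raw <- pooled_concordance(lp_raw, dat$resp)
c_pooled_ctr <- pooled_concordance(lp_ctr, dat$resp)

round(c(within_raw = c_within_raw, within_centered = c_within_ctr,
        pooled_raw = c_pooled_raw, pooled_centered = c_pooled_ctr), 3)

# For comparison: the concordance that summary(fit) itself reports.
summary(fit)$concordance[1]
```

The within-pair figure does not change when the scores are centered, because centering shifts both members of a pair by the same amount. The pooled figure does change, since scores from different pairs are then compared after each pair has been shifted by its own mean. That is one plausible reason, under the assumptions above, why a pooled calculation on `predict(fit)` could differ noticeably from the stratified value in `summary(fit)`.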