[R] Profile confidence intervals and LR chi-square test


[R] Profile confidence intervals and LR chi-square test

System: R 2.3.1 on Windows XP machine.

I am building a logistic regression model for a sample of 100 cases in
dataframe "d", in which there are 3 binary covariates: x1, x2 and x3.

----------------

> summary(d)
 y      x1     x2     x3
 0:54   0:50   0:64   0:78
 1:46   1:50   1:36   1:22

> fit <- glm(y ~ x1 + x2 + x3, data=d, family=binomial(link=logit))

> summary(fit)

Call:
glm(formula = y ~ x1 + x2 + x3, family = binomial(link = logit),
    data = d)

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-1.6503  -1.0220  -0.7284   0.9965   1.7069

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)  -0.3772     0.3721  -1.014   0.3107
x11          -0.8144     0.4422  -1.842   0.0655 .
x21           0.9226     0.4609   2.002   0.0453 *
x31           1.3347     0.5576   2.394   0.0167 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 137.99  on 99  degrees of freedom
Residual deviance: 120.65  on 96  degrees of freedom
AIC: 128.65

Number of Fisher Scoring iterations: 4

> exp(fit$coef)
(Intercept)         x11         x21         x31
  0.6858006   0.4429233   2.5157321   3.7989873

---------------

After reading the appropriate sections in MASS4 (7.2 and 8.4 in
particular), I decided to estimate the 95% confidence intervals for the
odds ratios using the profile method implemented in the "confint"
function. I then used the "anova" function to perform the deviance
chi-square tests for each covariate.

---------------

> ci <- confint(fit); exp(ci)
Waiting for profiling to be done...
                2.5 %    97.5 %
(Intercept) 0.3246680  1.413684
x11         0.1834819  1.048154
x21         1.0256096  6.314473
x31         1.3221533 12.129210

> anova(fit, test='Chisq')
Analysis of Deviance Table

Model: binomial, link: logit

Response: y

Terms added sequentially (first to last)

     Df Deviance Resid. Df Resid. Dev P(>|Chi|)
NULL                    99    137.989
x1    1    5.856        98    132.133     0.016
x2    1    5.271        97    126.862     0.022
x3    1    6.212        96    120.650     0.013

----------------

My question relates to the interpretation of the significance of
variable x1.  The OR for x1 is 0.443 and its profile confidence interval
is 0.183-1.048.  If a type I error rate of 5% is assumed, this result
would tend to suggest that x1 is NOT a significant predictor of y.
However, the deviance chi-square test has a P-value of 0.016, which
suggests that x1 is indeed a significant predictor of y. How do I
reconcile these two differing messages? I do recognize that the upper
bound of the confidence interval is pretty close to 1, but I am certain
that some journal reviewer will point out the problem as inconsistent.

Brant Inman

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Profile confidence intervals and LR chi-square test

On 2006-11-14 00:41, Inman, Brant A. M.D. wrote:
> System: R 2.3.1 on Windows XP machine.

Time to upgrade!

>
> I am building a logistic regression model for a sample of 100 cases in
> dataframe "d", in which there are 3 binary covariates: x1, x2 and x3.

Please provide a reproducible example (as suggested by the posting guide).

>
> ----------------
>
>> summary(d)
>  y      x1     x2     x3
>  0:54   0:50   0:64   0:78
>  1:46   1:50   1:36   1:22
>
>> fit <- glm(y ~ x1 + x2 + x3, data=d, family=binomial(link=logit))
>
>> summary(fit)
>
> Call:
> glm(formula = y ~ x1 + x2 + x3, family = binomial(link = logit),
>     data = d)
>
> Deviance Residuals:
>     Min       1Q   Median       3Q      Max
> -1.6503  -1.0220  -0.7284   0.9965   1.7069
>
> Coefficients:
>             Estimate Std. Error z value Pr(>|z|)
> (Intercept)  -0.3772     0.3721  -1.014   0.3107
> x11          -0.8144     0.4422  -1.842   0.0655 .
> x21           0.9226     0.4609   2.002   0.0453 *
> x31           1.3347     0.5576   2.394   0.0167 *
> ---
> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>
> (Dispersion parameter for binomial family taken to be 1)
>
>     Null deviance: 137.99  on 99  degrees of freedom
> Residual deviance: 120.65  on 96  degrees of freedom
> AIC: 128.65
>
> Number of Fisher Scoring iterations: 4
>
>> exp(fit$coef)
> (Intercept)         x11         x21         x31
>   0.6858006   0.4429233   2.5157321   3.7989873
> ---------------
>
> After reading the appropriate sections in MASS4 (7.2 and 8.4 in
> particular), I decided to estimate the 95% confidence intervals for the
> odds ratios using the profile method implemented in the "confint"
> function. I then used the "anova" function to perform the deviance
> chi-square tests for each covariate.
>
> ---------------
>> ci <- confint(fit); exp(ci)
> Waiting for profiling to be done...
>                 2.5 %    97.5 %
> (Intercept) 0.3246680  1.413684
> x11         0.1834819  1.048154
> x21         1.0256096  6.314473
> x31         1.3221533 12.129210
>
>> anova(fit, test='Chisq')
> Analysis of Deviance Table
>
> Model: binomial, link: logit
>
> Response: y
>
> Terms added sequentially (first to last)
              ^^^^^^^^^^^^

Hence, your use of the `anova' function doesn't return tests
corresponding to the CIs computed above.

>
>      Df Deviance Resid. Df Resid. Dev P(>|Chi|)
> NULL                    99    137.989
> x1    1    5.856        98    132.133     0.016
> x2    1    5.271        97    126.862     0.022
> x3    1    6.212        96    120.650     0.013
> ----------------
>
> My question relates to the interpretation of the significance of
> variable x1.  The OR for x1 is 0.443 and its profile confidence interval
> is 0.183-1.048.  If a type I error rate of 5% is assumed, this result
> would tend to suggest that x1 is NOT a significant predictor of y.

This is also suggested by the Wald test computed by the `summary' function.

> However, the deviance chi-square test has a P-value of 0.016, which
> suggests that x1 is indeed a significant predictor of y. How do I

That p-value corresponds to adding x1 to a model containing only the
intercept term.

> reconcile these two differing messages? I do recognize that the upper

Generally, in order to compute the LR test for the null hypothesis of
some subset of the parameters being equal to zero, you need to
explicitly fit both the restricted and the unrestricted model and
compare them using the `anova' function.

Also, see FAQ 7.18.

HTH,
Henric

> bound of the confidence interval is pretty close to 1, but I am certain
> that some journal reviewer will point out the problem as inconsistent.
>
> Brant Inman
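The comparison of restricted and unrestricted models described in this reply might look roughly like the sketch below. The data frame here is simulated purely for illustration, since the original "d" was never posted; the seed, the simulated coefficients, and the object names `full`, `reduced`, `lrt` and `seq_last` are all assumptions and will not reproduce the numbers in the thread.

```r
# Simulated stand-in for the poster's data frame "d" (the real data were
# not posted, so every number printed here is illustrative only).
set.seed(1)
n   <- 100
x1  <- rbinom(n, 1, 0.50)
x2  <- rbinom(n, 1, 0.36)
x3  <- rbinom(n, 1, 0.22)
eta <- -0.4 - 0.8 * x1 + 0.9 * x2 + 1.3 * x3
y   <- rbinom(n, 1, plogis(eta))
d   <- data.frame(y = factor(y), x1 = factor(x1),
                  x2 = factor(x2), x3 = factor(x3))

# Unrestricted model, and the restricted model with x1 removed.
full    <- glm(y ~ x1 + x2 + x3, data = d, family = binomial(link = logit))
reduced <- update(full, . ~ . - x1)

# LR test of x1 adjusted for x2 and x3: this is the test that matches
# the profile confidence interval for x1.
lrt <- anova(reduced, full, test = "Chisq")
print(lrt)

# The same statistic computed by hand from the two residual deviances.
lr_stat <- deviance(reduced) - deviance(full)
lr_p    <- pchisq(lr_stat, df = 1, lower.tail = FALSE)

# A sequential table gives the adjusted test only for the LAST term,
# so putting x1 last reproduces lr_stat in the final row.
seq_last <- anova(glm(y ~ x2 + x3 + x1, data = d, family = binomial),
                  test = "Chisq")
print(seq_last)
```

In the sequential table for `y ~ x1 + x2 + x3`, by contrast, the x1 row compares the intercept-only model with the model containing x1 alone, which is why its p-value need not agree with the interval.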

Re: [R] Profile confidence intervals and LR chi-square test

Your problem is the interpretation of anova(): it is a sequential test
and x1 is the first term.  Using dropterm() would give you the correct
LR test.

However, you also have a Wald test given by the line

> x11          -0.8144     0.4422  -1.842   0.0655 .

which is not significant at the 5% level.  The correct LRT would be
expected to be more accurate, and your inversion of the profile
likelihood is just a way to compute the LRT.

On Mon, 13 Nov 2006, Inman, Brant A.   M.D. wrote:

>
> System: R 2.3.1 on Windows XP machine.
>
> I am building a logistic regression model for a sample of 100 cases in
> dataframe "d", in which there are 3 binary covariates: x1, x2 and x3.
>
> ----------------
>
>> summary(d)
> y      x1     x2     x3
> 0:54   0:50   0:64   0:78
> 1:46   1:50   1:36   1:22
>
>> fit <- glm(y ~ x1 + x2 + x3, data=d, family=binomial(link=logit))
>
>> summary(fit)
>
> Call:
> glm(formula = y ~ x1 + x2 + x3, family = binomial(link = logit),
>    data = d)
>
> Deviance Residuals:
>    Min       1Q   Median       3Q      Max
> -1.6503  -1.0220  -0.7284   0.9965   1.7069
>
> Coefficients:
>            Estimate Std. Error z value Pr(>|z|)
> (Intercept)  -0.3772     0.3721  -1.014   0.3107
> x11          -0.8144     0.4422  -1.842   0.0655 .
> x21           0.9226     0.4609   2.002   0.0453 *
> x31           1.3347     0.5576   2.394   0.0167 *
> ---
> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>
> (Dispersion parameter for binomial family taken to be 1)
>
>    Null deviance: 137.99  on 99  degrees of freedom
> Residual deviance: 120.65  on 96  degrees of freedom
> AIC: 128.65
>
> Number of Fisher Scoring iterations: 4
>
>> exp(fit$coef)
> (Intercept)         x11         x21         x31
>  0.6858006   0.4429233   2.5157321   3.7989873
> ---------------
>
> After reading the appropriate sections in MASS4 (7.2 and 8.4 in
> particular), I decided to estimate the 95% confidence intervals for the
> odds ratios using the profile method implemented in the "confint"
> function. I then used the "anova" function to perform the deviance
> chi-square tests for each covariate.
>
> ---------------
>> ci <- confint(fit); exp(ci)
> Waiting for profiling to be done...
>                2.5 %    97.5 %
> (Intercept) 0.3246680  1.413684
> x11         0.1834819  1.048154
> x21         1.0256096  6.314473
> x31         1.3221533 12.129210
>
>> anova(fit, test='Chisq')
> Analysis of Deviance Table
>
> Model: binomial, link: logit
>
> Response: y
>
> Terms added sequentially (first to last)
>
>
>     Df Deviance Resid. Df Resid. Dev P(>|Chi|)
> NULL                    99    137.989
> x1    1    5.856        98    132.133     0.016
> x2    1    5.271        97    126.862     0.022
> x3    1    6.212        96    120.650     0.013
> ----------------
>
> My question relates to the interpretation of the significance of
> variable x1.  The OR for x1 is 0.443 and its profile confidence interval
> is 0.183-1.048.  If a type I error rate of 5% is assumed, this result
> would tend to suggest that x1 is NOT a significant predictor of y.
> However, the deviance chi-square test has a P-value of 0.016, which
> suggests that x1 is indeed a significant predictor of y. How do I
> reconcile these two differing messages? I do recognize that the upper
> bound of the confidence interval is pretty close to 1, but I am certain
> that some journal reviewer will point out the problem as inconsistent.
>
> Brant Inman
>

-- 
Brian D. Ripley,                  [hidden email]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
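As a rough illustration of the dropterm() suggestion, and of the remark that inverting the profile likelihood is a way of computing the LRT, one could try something like the following. The data are again simulated (the poster's "d" is not available), and the seed, coefficients, and the names `dt`, `d1`, `b_up`, `fixed` and `dev_gap` are hypothetical. The endpoint check relies on the assumption that confint() interpolates the profile closely enough for the match to be approximate rather than exact.

```r
library(MASS)   # provides dropterm()

# Simulated stand-in for the poster's data frame "d" (illustrative only).
set.seed(1)
n   <- 100
x1  <- rbinom(n, 1, 0.50)
x2  <- rbinom(n, 1, 0.36)
x3  <- rbinom(n, 1, 0.22)
eta <- -0.4 - 0.8 * x1 + 0.9 * x2 + 1.3 * x3
y   <- rbinom(n, 1, plogis(eta))
d   <- data.frame(y = factor(y), x1 = factor(x1),
                  x2 = factor(x2), x3 = factor(x3))

fit <- glm(y ~ x1 + x2 + x3, data = d, family = binomial(link = logit))

# LR test for dropping each term in turn from the full model, so every
# row is adjusted for the other two covariates.
dt <- dropterm(fit, test = "Chisq")
print(dt)

# drop1() from the stats package performs the same single-term deletions.
d1 <- drop1(fit, test = "Chisq")

# Link to the profile interval: fix the x1 coefficient at the upper 95%
# profile limit (via an offset) and refit the remaining terms.
ci    <- confint(fit)                 # log-odds scale
b_up  <- ci["x11", 2]
fixed <- glm(y ~ x2 + x3 + offset(b_up * (x1 == "1")),
             data = d, family = binomial(link = logit))

# The deviance rises by roughly qchisq(0.95, 1), i.e. the LR test of
# "beta_x1 = b_up" sits right at the 5% boundary.
dev_gap <- deviance(fixed) - deviance(fit)
print(c(dev_gap = dev_gap, cutoff = qchisq(0.95, 1)))
```

Read this way, a 95% profile interval for x1 that contains 0 (an odds ratio of 1) corresponds to an adjusted LR p-value above 0.05, so the interval and the dropterm() row for x1 should tell the same story.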