Question on approximations of full logistic regression model

Hi,

I am trying to construct a logistic regression model from my data (104 patients and 25 events). I built a full model consisting of five predictors, using penalization from the rms package (lrm, pentrace, etc.) because of the events-per-variable issue. Then I tried to approximate the full model with the step-down technique, predicting L (the linear predictor of the full model) from all of the component variables using ordinary least squares (ols in the rms package), as follows. I would like to know whether I am doing this right.

> library(rms)
> plogit <- predict(full.model)
> full.ols <- ols(plogit ~ stenosis+x1+x2+ClinicalScore+procedure, sigma=1)
> fastbw(full.ols, aics=1e10)

 Deleted       Chi-Sq d.f. P      Residual d.f. P      AIC    R2
 stenosis       1.41  1    0.2354   1.41   1    0.2354  -0.59 0.991
 x2            16.78  1    0.0000  18.19   2    0.0001  14.19 0.882
 procedure     26.12  1    0.0000  44.31   3    0.0000  38.31 0.711
 ClinicalScore 25.75  1    0.0000  70.06   4    0.0000  62.06 0.544
 x1            83.42  1    0.0000 153.49   5    0.0000 143.49 0.000

Then I fitted an approximation to the full model using the most important variables, stopping before the R^2 of the reduced model's predictions against the full-model linear predictor drops below 0.95; that is, dropping only "stenosis".

> full.ols.approx <- ols(plogit ~ x1+x2+ClinicalScore+procedure)
> full.ols.approx$stats

          n  Model L.R.        d.f.          R2           g       Sigma
104.0000000 487.9006640   4.0000000   0.9908257   1.3341718   0.1192622

This approximate model had an R^2 against the full model of 0.99. Therefore, I updated the original full logistic model, dropping "stenosis" as a predictor.

> full.approx.lrm <- update(full.model, ~ . -stenosis)
> validate(full.model, bw=F, B=1000)

          index.orig training    test optimism index.corrected    n
Dxy           0.6425   0.7017  0.6131   0.0887          0.5539 1000
R2            0.3270   0.3716  0.3335   0.0382          0.2888 1000
Intercept     0.0000   0.0000  0.0821  -0.0821          0.0821 1000
Slope         1.0000   1.0000  1.0548  -0.0548          1.0548 1000
Emax          0.0000   0.0000  0.0263   0.0263          0.0263 1000

> validate(full.approx.lrm, bw=F, B=1000)

          index.orig training    test optimism index.corrected    n
Dxy           0.6446   0.6891  0.6265   0.0626          0.5820 1000
R2            0.3245   0.3592  0.3428   0.0164          0.3081 1000
Intercept     0.0000   0.0000  0.1281  -0.1281          0.1281 1000
Slope         1.0000   1.0000  1.1104  -0.1104          1.1104 1000
Emax          0.0000   0.0000  0.0444   0.0444          0.0444 1000

Validation revealed that this approximation was not bad. Then I made a nomogram.

> full.approx.lrm.nom <- nomogram(full.approx.lrm, fun.at=c(0.05,0.1,0.2,0.4,0.6,0.8,0.9,0.95), fun=plogis)
> plot(full.approx.lrm.nom)

Another nomogram, using the ols model:

> full.ols.approx.nom <- nomogram(full.ols.approx, fun.at=c(0.05,0.1,0.2,0.4,0.6,0.8,0.9,0.95), fun=plogis)
> plot(full.ols.approx.nom)

These two nomograms are very similar but slightly different. My questions are:

1. Am I doing this right?
2. Which nomogram is correct?

I would appreciate your help. Thank you in advance.

-- KH
Re: Question on approximations of full logistic regression model

I think you are doing this correctly except for one thing. The validation and other inferential calculations should be done on the full model. Use the approximate model to get a simpler nomogram but not to get standard errors. Since you are dropping only one variable, you might consider just running the nomogram on the entire model.

Frank

Frank Harrell Department of Biostatistics, Vanderbilt University
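
The workflow suggested in the reply can be sketched roughly as follows. This is only an illustration under stated assumptions: the data are simulated stand-ins (the variable names mirror the post, the values are made up), the penalty value is an arbitrary placeholder rather than one chosen with pentrace, and B is kept small for speed. Validation is run on the full penalized model, and the OLS approximation is used only to draw the simpler nomogram.

```r
library(rms)

set.seed(1)
n <- 104

# Simulated stand-in data (assumption: not the poster's data)
d <- data.frame(
  stenosis      = rnorm(n),
  x1            = rnorm(n),
  x2            = rnorm(n),
  ClinicalScore = rnorm(n),
  procedure     = factor(sample(c("A", "B"), n, replace = TRUE))
)
lp.true <- -1.5 + 1.2 * d$x1 + 0.6 * d$x2 + 0.7 * d$ClinicalScore +
  0.8 * (d$procedure == "B")
d$y <- rbinom(n, 1, plogis(lp.true))

# rms needs predictor summaries for nomogram()
dd <- datadist(d)
options(datadist = "dd")

# Full penalized model; penalty = 1 is a placeholder, not a tuned value
full.model <- lrm(y ~ stenosis + x1 + x2 + ClinicalScore + procedure,
                  data = d, x = TRUE, y = TRUE, penalty = 1)

# Validation (and any other inference) is done on the full model
val.full <- validate(full.model, B = 200)

# Approximate the full-model linear predictor with OLS, dropping stenosis
d$plogit <- predict(full.model)
approx.ols <- ols(plogit ~ x1 + x2 + ClinicalScore + procedure, data = d)
r2.approx <- approx.ols$stats["R2"]  # agreement with the full model

# The approximation is used only for the simplified nomogram
nom <- nomogram(approx.ols, fun = plogis,
                fun.at = c(0.05, 0.1, 0.2, 0.4, 0.6, 0.8, 0.9, 0.95),
                funlabel = "Predicted probability")
plot(nom)
```

Here `r2.approx` plays the role of the 0.99 reported in the question, and `val.full` holds the optimism-corrected indexes that would be reported for the model.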

