|
Dear R-users,
I want to fit a linear model with Y as response variable and X a categorical variable (with 4 categories), with the aim of comparing the basal category of X (category=1) with category 4. Unfortunately, there is another categorical variable with 2 categories which interact with x and I have to include it, so my model is s "reg3: Y=x*x3". Using fit.contrast to make the contrast (category 1 vs category 4) with options(contrasts=c("contr.treatment", "contr.poly")), it makes the contrast but just for the basal category of x3, (coincident with the corresponding result of summary(reg3)), so that it is not what I am looking for, and it seems that when I write (contrasts=c("contr.sum", "contr.poly")) before fit.contrast, it adjust for x3. Here I send a SMALL EXAMPLE that tries to imiate my problem. library(gmodels) set.seed(100) options(contrasts=c("contr.treatment", "contr.poly")) y <- rnorm(100) x <- cut(rnorm(100, mean=y, sd=0.25),c(-4,-1.5,0,1.5,4)) # 4 CATEGORIES x3 <- cut(rnorm(100, mean=y, sd=8),c(-50,0,50)) # 2 CATEGORIES reg3<-lm(y~ x * x3 ) summary(reg3) # category 1 vs category 4 estimate: 3.84776 , for basal category of x3 fit.contrast(reg3,x,c(-1,0,0,1)) # g4 vs g1 # category 1 vs category 4 estimate: 3.84776 , options(contrasts=c("contr.sum", "contr.poly")) fit.contrast(reg3,x,c(-1,0,0,1)) # g4 vs g1 # category 1 vs category 4 estimate: 4.010439 I deduce that the global comparison of category 1 vs category 4 is 4.01, and not 3.84. Unfortunately, the real variable that interact with x is not categorical but continuous in my real case, and the new model is Y=x*x2. Again, with the contr.treatment option, the comparison of category 1 vs category 4 is done for the intercept, that is, for the numerical variable assumed to be 0. As i am interested in comparing category 1 vs category 4 adjusting for x2 in general terms, I use contr.sum as before, but it does not seem to produce any modification: x2 <- rnorm(100,mean=y,sd=0.5) # NUMERIC reg2 <- lm(y ~ x * x2 ) summary(reg2) # category 1 vs category 4 estimate: 3.067346 for x2=0 options(contrasts=c("contr.treatment", "contr.poly")) fit.contrast(reg2,x,c(-1,0,0,1)) # g4 vs g1 estimate: 3.067346 for x2=0 options(contrasts=c("contr.sum", "contr.poly")) fit.contrast(reg2,x,c(-1,0,0,1)) # g4 vs g1 estimate: 3.067346 for x2=0 The question is: How could I contrast simulaneously in global terms (not intercept and slope separately) if there are differences among category 1 vs category 4 but adjusting for a continuous variable? Or it is not possible to do so, and I have to contrast both (difference of intercepts and difference of slopes separately) and obtain a global conclussion? Thanks a lot in advance, Berta [[alternative HTML version deleted]] ______________________________________________ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. |
|
Hello Berta,
gmodels::fit.contrasts() simply performs a single-variable contrast using the model you have specified. To perform the more involved contrasts that you are describing, there are two approaches: 1) use the estimable() function in the gmodels package. gmodels::estimable() allows you to provides an arbitrary function to be applied to the estimated model parameters, allowing you to perform any appropriate (or inappropriate) contrast calculation. 2) Use the contrast functions provided by Frank Harrell's Hmisc package. Frank's functions allow you to specify the desired value or level of each parameter for the contrast, and handle unspecified parameters. I hope this helps, -Greg On Oct 9, 2007, at 6:10AM , Berta wrote: > Dear R-users, > I want to fit a linear model with Y as response variable and X a > categorical variable (with 4 categories), with the aim of comparing > the basal category of X (category=1) with category 4. > Unfortunately, there is another categorical variable with 2 > categories which interact with x and I have to include it, so my > model is s "reg3: Y=x*x3". Using fit.contrast to make the contrast > (category 1 vs category 4) with options(contrasts=c > ("contr.treatment", "contr.poly")), it makes the contrast but just > for the basal category of x3, (coincident with the corresponding > result of summary(reg3)), so that it is not what I am looking for, > and it seems that when I write (contrasts=c("contr.sum", > "contr.poly")) before fit.contrast, it adjust for x3. Here I send a > SMALL EXAMPLE that tries to imiate my problem. > > library(gmodels) > set.seed(100) > options(contrasts=c("contr.treatment", "contr.poly")) > y <- rnorm(100) > x <- cut(rnorm(100, mean=y, sd=0.25),c(-4,-1.5,0,1.5,4)) # 4 > CATEGORIES > x3 <- cut(rnorm(100, mean=y, sd=8),c(-50,0,50)) # 2 CATEGORIES > reg3<-lm(y~ x * x3 ) > summary(reg3) # category 1 vs category 4 estimate: 3.84776 , for > basal category of x3 > fit.contrast(reg3,x,c(-1,0,0,1)) # g4 vs g1 # category 1 vs > category 4 estimate: 3.84776 , > options(contrasts=c("contr.sum", "contr.poly")) > fit.contrast(reg3,x,c(-1,0,0,1)) # g4 vs g1 # category 1 vs > category 4 estimate: 4.010439 > > I deduce that the global comparison of category 1 vs category 4 is > 4.01, and not 3.84. > > Unfortunately, the real variable that interact with x is not > categorical but continuous in my real case, and the new model is > Y=x*x2. Again, with the contr.treatment option, the comparison of > category 1 vs category 4 is done for the intercept, that is, for > the numerical variable assumed to be 0. As i am interested in > comparing category 1 vs category 4 adjusting for x2 in general > terms, I use contr.sum as before, but it does not seem to produce > any modification: > > x2 <- rnorm(100,mean=y,sd=0.5) # NUMERIC > reg2 <- lm(y ~ x * x2 ) > summary(reg2) # category 1 vs category 4 estimate: 3.067346 for x2=0 > options(contrasts=c("contr.treatment", "contr.poly")) > fit.contrast(reg2,x,c(-1,0,0,1)) # g4 vs g1 estimate: 3.067346 for > x2=0 > options(contrasts=c("contr.sum", "contr.poly")) > fit.contrast(reg2,x,c(-1,0,0,1)) # g4 vs g1 estimate: 3.067346 for > x2=0 > > The question is: How could I contrast simulaneously in global terms > (not intercept and slope separately) if there are differences among > category 1 vs category 4 but adjusting for a continuous variable? > Or it is not possible to do so, and I have to contrast both > (difference of intercepts and difference of slopes separately) and > obtain a global conclussion? > > Thanks a lot in advance, > > Berta > [[alternative HTML version deleted]] > > ______________________________________________ > [hidden email] mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting- > guide.html > and provide commented, minimal, self-contained, reproducible code. ______________________________________________ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. |
|
In reply to this post by Berta-2-3
Hi Gregory, thank you so much for your answer, the use of estimable() gave me interesting hints. Following the same small example as before, I observed that "fit.contrast" and "estimable" give the same response when contrasting A vs D in the categorial variable x (differences among groups at the intercept term), and estimable allows you to fix the continuous variable x2 to a given value (say 2) and contrast if the differences are significant between groups at this fixed numeric variable (see example + results below). However, in my real data, the resutls are obtained but a warning appears: "Warning message: Degrees of freedom vary among parameters used to construct linear contrast(s): 1. Using the smallest df among the set of parameters. in: estimable(frec.intermixto, c("xD"=1, "edad:xD"=40)). I suspect that something is wrong with the d.f. (it seems that narrower conf. int than expected are obtained). Is it a wrong conclusion to interpretate the p-values? Is there another way to adjuts this, or it is just inappropriate to do this type of contrast? Thanks a lot in advance! Berta SMALL EXPAMPLE library(gmodels) set.seed(100) y <- rnorm(100) x <- cut(rnorm(100, mean=y, sd=0.25),c(-4,-1.5,0,1.5,4)) ; levels(x)<- c("A","B","C","D") x2 <- rnorm(100,mean=y,sd=0.5) reg2 <- lm(y ~ x * x2 ) summary(reg2) fit.contrast(reg2,x,c(-1,0,0,1)) # g4 vs g1 # Estimate Std. Error t value Pr(>|t|) #x c=( -1 0 0 1 ) 3.067346 0.7645143 4.01215 0.0001224165 estimable(reg2, c("xD"=1) ) # Estimate Std. Error t value DF Pr(>|t|) #(0 0 0 1 0 0 0 0) 3.067346 0.7645143 4.01215 92 0.0001224165 estimable(reg2, c("xD"=1, "xD:x2"=2) ) # Estimate Std. Error t value DF Pr(>|t|) #(0 0 0 1 0 0 0 2) 3.836914 0.6408911 5.986843 92 4.083757e-08 win.graph() plot(c(min(x2), max(x2)), c(min(y),max(y)), type="n", xlab="x2", ylab="y") points(c(0, max(x2)), c(-1.936, -1.936+0.032*max(x2)), type="l", lty=1) points(c(0, max(x2)), c(-1.936+3.067, (-1.936+3.067+(0.032+0.3847)*max(x2))), type="l", lty=2) title("XA vs XD") legend(-2,2,c("XA", "XD"), lty=c(1,2)) ########################################################################### ########################################################################### ########################################################################## >From: Gregory Warnes <[hidden email]> >Date: Tue, 9 Oct 2007 09:20:51 -0400 >To: Berta <[hidden email]> >X-Mailer: Apple Mail (2.752.3) >X-Virus-Scanned: by amavisd-new at stat.math.ethz.ch >Cc: [hidden email] >Hello Berta, > >gmodels::fit.contrasts() simply performs a single-variable contrast >using the model you have specified. To perform the more involved >contrasts that you are describing, there are two approaches: > >1) use the estimable() function in the gmodels package. >gmodels::estimable() allows you to provides an arbitrary function to >be applied to the estimated model parameters, allowing you to perform >any appropriate (or inappropriate) contrast calculation. > >2) Use the contrast functions provided by Frank Harrell's Hmisc >package. Frank's functions allow you to specify the desired value >or level of each parameter for the contrast, and handle unspecified >parameters. > >I hope this helps, > >-Greg > > > >On Oct 9, 2007, at 6:10AM , Berta wrote: > > > Dear R-users, > > I want to fit a linear model with Y as response variable and X a > > categorical variable (with 4 categories), with the aim of comparing > > the basal category of X (category=1) with category 4. > > Unfortunately, there is another categorical variable with 2 > > categories which interact with x and I have to include it, so my > > model is s "reg3: Y=x*x3". Using fit.contrast to make the contrast > > (category 1 vs category 4) with options(contrasts=c > > ("contr.treatment", "contr.poly")), it makes the contrast but just > > for the basal category of x3, (coincident with the corresponding > > result of summary(reg3)), so that it is not what I am looking for, > > and it seems that when I write (contrasts=c("contr.sum", > > "contr.poly")) before fit.contrast, it adjust for x3. Here I send a > > SMALL EXAMPLE that tries to imiate my problem. > > > > library(gmodels) > > set.seed(100) > > options(contrasts=c("contr.treatment", "contr.poly")) > > y <- rnorm(100) > > x <- cut(rnorm(100, mean=y, sd=0.25),c(-4,-1.5,0,1.5,4)) # 4 > > CATEGORIES > > x3 <- cut(rnorm(100, mean=y, sd=8),c(-50,0,50)) # 2 CATEGORIES > > reg3<-lm(y~ x * x3 ) > > summary(reg3) # category 1 vs category 4 estimate: 3.84776 , for > > basal category of x3 > > fit.contrast(reg3,x,c(-1,0,0,1)) # g4 vs g1 # category 1 vs > > category 4 estimate: 3.84776 , > > options(contrasts=c("contr.sum", "contr.poly")) > > fit.contrast(reg3,x,c(-1,0,0,1)) # g4 vs g1 # category 1 vs > > category 4 estimate: 4.010439 > > > > I deduce that the global comparison of category 1 vs category 4 is > > 4.01, and not 3.84. > > > > Unfortunately, the real variable that interact with x is not > > categorical but continuous in my real case, and the new model is > > Y=x*x2. Again, with the contr.treatment option, the comparison of > > category 1 vs category 4 is done for the intercept, that is, for > > the numerical variable assumed to be 0. As i am interested in > > comparing category 1 vs category 4 adjusting for x2 in general > > terms, I use contr.sum as before, but it does not seem to produce > > any modification: > > > > x2 <- rnorm(100,mean=y,sd=0.5) # NUMERIC > > reg2 <- lm(y ~ x * x2 ) > > summary(reg2) # category 1 vs category 4 estimate: 3.067346 for x2=0 > > options(contrasts=c("contr.treatment", "contr.poly")) > > fit.contrast(reg2,x,c(-1,0,0,1)) # g4 vs g1 estimate: 3.067346 for > > x2=0 > > options(contrasts=c("contr.sum", "contr.poly")) > > fit.contrast(reg2,x,c(-1,0,0,1)) # g4 vs g1 estimate: 3.067346 for > > x2=0 > > > > The question is: How could I contrast simulaneously in global terms > > (not intercept and slope separately) if there are differences among > > category 1 vs category 4 but adjusting for a continuous variable? > > Or it is not possible to do so, and I have to contrast both > > (difference of intercepts and difference of slopes separately) and > > obtain a global conclussion? > > > > Thanks a lot in advance, > > > > Berta > > [[alternative HTML version deleted]] > > > > ______________________________________________ > > [hidden email] mailing list > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide http://www.R-project.org/posting- > > guide.html > > and provide commented, minimal, self-contained, reproducible code. > >______________________________________________ >[hidden email] mailing list >https://stat.ethz.ch/mailman/listinfo/r-help >PLEASE do read the posting guide >http://www.R-project.org/posting-guide.html >and provide commented, minimal, self-contained, reproducible code. ______________________________________________ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. |
| Powered by Nabble | Edit this page |
