Implementation of contrasts

classic Classic list List threaded Threaded
2 messages Options
Reply | Threaded
Open this post in threaded view
|

Implementation of contrasts

Fisher Dennis
R 3.2.1
OS X

Colleagues

I am trying to understand ANOVA contrasts and I have encountered some puzzling results.

I have data from five studies with four different combinations of treatments.  I want to make the following comparisons:

Study 1, 3: A / B vs. C / D (i.e., mean of A and B vs. mean of C and D)
Study 2: A vs. B / C
Study 4: A / B / C vs. D
Study 5: A vs. B / C / D

With some trial-and-error, I got the following code to yield results matching SAS outputs:
        AB.CD <- c(1, 1, -1, -1)/2
        A.BC <- c(1, -1, -1)/2
        ABC.D <- c(1, 1, 1, -1)/2
        A.BCD <- c(1, -1, -1, -1)/2
        if (STUDY %in% c(1,3))) contrasts(DATA$TRT)     <- AB.CD
        if (STUDY == 2) contrasts(DATA$TRT)     <- A.BC
        if (STUDY == 4) contrasts(DATA$TRT)     <- ABC.D
        if (STUDY == 5) contrasts(DATA$TRT)     <- A.BCD

AB.CD makes sense to me — take one-half of each of A and B compare to negative one-half of C and D (the contrasts add to zero).
However, I don’t understand how the other contrasts are written (i.e., they don’t add to zero).  For example, I tried:
        A.BC <- c(2, -1, -1) / 2
        ABC.D <- c(1, 1, 1, -3)/3
without success (they yielded results markedly different from SAS)

I have searched the web extensively but the explanations of contrasts in R are not particularly understandable.  Can anyone help me understand the specifics of this situation?  Thanks in advance.

Dennis

Dennis Fisher MD
P < (The "P Less Than" Company)
Phone: 1-866-PLessThan (1-866-753-7784)
Fax: 1-866-PLessThan (1-866-753-7784)
www.PLessThan.com

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: Implementation of contrasts

Peter Dalgaard-2

> On 07 Jul 2015, at 16:20 , Dennis Fisher <[hidden email]> wrote:
>
> R 3.2.1
> OS X
>
> Colleagues
>
> I am trying to understand ANOVA contrasts and I have encountered some puzzling results.
>
> I have data from five studies with four different combinations of treatments.  I want to make the following comparisons:
>
> Study 1, 3: A / B vs. C / D (i.e., mean of A and B vs. mean of C and D)
> Study 2: A vs. B / C
> Study 4: A / B / C vs. D
> Study 5: A vs. B / C / D
>
> With some trial-and-error, I got the following code to yield results matching SAS outputs:
> AB.CD <- c(1, 1, -1, -1)/2
> A.BC <- c(1, -1, -1)/2
> ABC.D <- c(1, 1, 1, -1)/2
> A.BCD <- c(1, -1, -1, -1)/2
> if (STUDY %in% c(1,3))) contrasts(DATA$TRT)     <- AB.CD
> if (STUDY == 2) contrasts(DATA$TRT)     <- A.BC
> if (STUDY == 4) contrasts(DATA$TRT)     <- ABC.D
> if (STUDY == 5) contrasts(DATA$TRT)     <- A.BCD
>
> AB.CD makes sense to me — take one-half of each of A and B compare to negative one-half of C and D (the contrasts add to zero).
> However, I don’t understand how the other contrasts are written (i.e., they don’t add to zero).  For example, I tried:
> A.BC <- c(2, -1, -1) / 2
> ABC.D <- c(1, 1, 1, -3)/3
> without success (they yielded results markedly different from SAS)
>
> I have searched the web extensively but the explanations of contrasts in R are not particularly understandable.  Can anyone help me understand the specifics of this situation?  Thanks in advance.

This is a standard pitfall. You have to distinguish between contrasts and contrast parametrizations. One is the contrast parameter as a function of the expected values, the other states the expected values as a function of the contrast parameters. Formulated that way, it is pretty obvious that they shouldn't be identical, but in some cases they actually are, at least kind of...

So e.g.

ABC.D <- c(1, 1, 1, -1)/2

states that the means of group A, B, and C are all at +a/2 but D is at -a/2 so the difference is a. You can orthogonalize the contrast by subtracting the mean, yielding

> x <- c(1,1,1,-1)/2
> x - mean(x)
[1]  0.25  0.25  0.25 -0.75

so now the three groups are at a/4 and D at -3a/4, the difference still being a. However, now they sum to zero and you will notice that they are proportional to (1,1,1,-3)/3. In fact,

> xx <- x - mean(x)
> xx/sum(xx^2)
[1]  0.3333333  0.3333333  0.3333333 -1.0000000

The whole thing is related to the familiar (X'X)^-1 X'Y formula. With the c(1, 1, 1, -1)/2 parametrization, we can do

> cbind(x,1)
        x  
[1,]  0.5 1
[2,]  0.5 1
[3,]  0.5 1
[4,] -0.5 1
> X <- cbind(x,1)
> solve(crossprod(X),t(X))
       [,1]      [,2]      [,3] [,4]
x 0.3333333 0.3333333 0.3333333 -1.0
  0.1666667 0.1666667 0.1666667  0.5

with an orthogonal parametrization, crossprod(X) becomes diagonal and the rows of solve(crossprod(X),t(X)) become proportional to corresponding rows of X.

>
> Dennis
>
> Dennis Fisher MD
> P < (The "P Less Than" Company)
> Phone: 1-866-PLessThan (1-866-753-7784)
> Fax: 1-866-PLessThan (1-866-753-7784)
> www.PLessThan.com
>
> ______________________________________________
> [hidden email] mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: [hidden email]  Priv: [hidden email]

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.