Sum of Squares Type I, II, III for ANOVA

Thanh Tran
Hi everyone,
I'm studying ANOVA in R and have some questions to share. I am investigating
the effects of 4 factors (temperature, 3 levels; asphalt content, 3 levels;
air voids, 2 levels; and sample thickness, 3 levels) on the hardness of
asphalt concrete in the tensile test (abbreviated as KIC). The data were
taken from a journal article. The code was written as follows:

> data = read.csv("Saha research.csv", header = TRUE)
> attach(data)
> tem = as.factor(temperature)
> ac = as.factor(AC)
> av = as.factor(AV)
> thick = as.factor(Thickness)
> model = lm(KIC ~ tem + ac + av + thick + tem:ac + tem:av + tem:thick +
+            ac:av + ac:thick + av:thick)
> anova(model) # Type I tests
> library(car)
Loading required package: carData
> anova(lm(KIC ~ tem + ac + av + thick + tem:ac + tem:av + tem:thick +
+          ac:av + ac:thick + av:thick), type = 2)
Error: $ operator is invalid for atomic vectors
> options(contrasts = c("contr.sum", "contr.poly"))
> Anova(model, type = "3") # Type III tests
> Anova(model, type = "2") # Type II tests

In R, the Type I, II, and III results are almost the same, as follows.

Analysis of Variance Table

Response: KIC
           Df  Sum Sq Mean Sq  F value    Pr(>F)
tem         2 15.3917  7.6958 427.9926 < 2.2e-16 ***
ac          2  0.1709  0.0854   4.7510 0.0096967 **
av          1  1.9097  1.9097 106.2055 < 2.2e-16 ***
thick       2  0.2041  0.1021   5.6756 0.0040359 **
tem:ac      4  0.5653  0.1413   7.8598 6.973e-06 ***
tem:av      2  1.7192  0.8596  47.8046 < 2.2e-16 ***
tem:thick   4  0.0728  0.0182   1.0120 0.4024210
ac:av       2  0.3175  0.1588   8.8297 0.0002154 ***
ac:thick    4  0.0883  0.0221   1.2280 0.3003570
av:thick    2  0.0662  0.0331   1.8421 0.1613058
Residuals 190  3.4164  0.0180
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

However, these results are different from the results in the article,
especially for the interaction (air voids and sample thickness). The
results presented in the article are as follows:
Analysis of variance for KIC, using Adjusted SS for tests.

Source                  DF    Seq SS   Adj MS  F-stat  P-value  Model findings
Temperature              2  15.39355  7.69677  426.68    <0.01  Significant
AC                       2   0.95784  0.47892   26.55    <0.01  Significant
AV                       1   0.57035  0.57035   31.62    <0.01  Significant
Thickness                2   0.20269  0.10135    5.62    <0.01  Significant
Temperature⁄AC           4   1.37762  0.34441   19.09    <0.01  Significant
Temperature⁄AV           2   0.8329   0.41645   23.09    <0.01  Significant
Temperature⁄thickness    4   0.07135  0.01784    0.99    0.415  Not significant
AC⁄AV                    2   0.86557  0.43279   23.99    <0.01  Significant
AC⁄thickness             4   0.04337  0.01084    0.6     0.662  Not significant
AV⁄thickness             2   0.17394  0.08697    4.82    <0.01  Significant
Error                  190   3.42734  0.01804
Total                  215  23.91653

Therefore, I wonder whether there is an error in my code or whether there is
another type of ANOVA in R that I should use. If you could help with this
problem, I would be most grateful.
Best regards,
Nhat Tran
P.S. I have also attached the CSV file and the paper for practicing R.
______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: Sum of Squares Type I, II, III for ANOVA

Fox, John
Dear Nhat Tran,

The output that you show is unreadable and as far as I can see, the data aren't attached, but perhaps the following will help: First, if you want Anova() to compute type III tests, then you have to set the contrasts properly *before* you fit the model, not after. Second, you can specify the model much more compactly as

  mod <- lm(KIC ~ tem*ac + tem*av + tem*thick + ac*av +ac*thick + av*thick)

Finally, as sound general practice, I'd not attach the data, but rather put your recoded variables in the data frame and then specify the data argument to lm().
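
The three suggestions above can be sketched together as follows. This is only an illustration under stated assumptions: the data frame `dat` is a simulated stand-in for "Saha research.csv" (the factor level values and the KIC values are made up), and the car call is guarded so the sketch also runs where car is not installed.

```r
# Hypothetical stand-in for "Saha research.csv": a balanced 3 x 3 x 2 x 3
# design with 4 replicates per cell (216 rows). Level values and KIC are
# invented for illustration only.
set.seed(1)
dat <- expand.grid(temperature = c(-20, -10, 0),
                   AC          = c(4, 5, 6),
                   AV          = c(4, 7),
                   Thickness   = c(25, 38, 50),
                   rep         = 1:4)
dat$KIC <- rnorm(nrow(dat), mean = 1, sd = 0.15)

# Recode inside the data frame instead of attach()-ing it.
dat <- within(dat, {
  tem   <- as.factor(temperature)
  ac    <- as.factor(AC)
  av    <- as.factor(AV)
  thick <- as.factor(Thickness)
})

# Set sum-to-zero contrasts BEFORE fitting the model.
old <- options(contrasts = c("contr.sum", "contr.poly"))
mod <- lm(KIC ~ tem*ac + tem*av + tem*thick + ac*av + ac*thick + av*thick,
          data = dat)
options(old)  # restore the previous contrast setting

# The fitted model records the contrast coding that was in effect at fit time.
print(mod$contrasts$tem)

# Type III tests, if the car package is available.
if (requireNamespace("car", quietly = TRUE)) {
  print(car::Anova(mod, type = 3))
}
```

Because the contrast coding is stored with the fit, changing `options(contrasts = ...)` after calling `lm()` does not alter an existing model object; the model has to be refitted.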

I hope that this helps,
 John

-----------------------------------------------------------------
John Fox
Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
Web: https://socialsciences.mcmaster.ca/jfox/

Re: Sum of Squares Type I, II, III for ANOVA

Fox, John
In reply to this post by Thanh Tran
Dear Nhat Tran,

One more thing: You could specify the model even more compactly as

  mod <- lm(KIC ~ (tem + ac + av + thick)^2)
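
As a quick check that the compact formula specifies the same model, the term labels of the two formulas can be compared. This is a small sketch that needs no data, since `terms()` works on the formula alone; the object names `f_long` and `f_short` are illustrative.

```r
# Long form from the earlier message and the compact ^2 form.
f_long  <- KIC ~ tem*ac + tem*av + tem*thick + ac*av + ac*thick + av*thick
f_short <- KIC ~ (tem + ac + av + thick)^2

# Expand each formula into its main effects and two-way interactions.
labels_long  <- attr(terms(f_long),  "term.labels")
labels_short <- attr(terms(f_short), "term.labels")

print(labels_short)
print(setequal(labels_long, labels_short))
```

The `^2` operator expands to all main effects plus all two-way interactions, which is the model fitted in this thread.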

Best,
 John

Re: Sum of Squares Type I, II, III for ANOVA

Fox, John
In reply to this post by Fox, John
Dear Thanh Tran,

When you start a discussion on r-help, it's polite to keep it there so other people can see what transpires. I'm consequently cc'ing this response to the r-help list.

The problem with your code is that anova(), as opposed to Anova(), has no type argument.
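
The error can be reproduced without the original data. The sketch below uses the built-in `mtcars` data purely as an assumption for illustration. As far as I can tell, `anova()` for `lm` objects treats extra arguments as further models to compare, so the number passed as `type` is handled like a model object, which would explain the message about `$`.

```r
# Any lm fit will do; mtcars is used here only for illustration.
fit <- lm(mpg ~ factor(cyl) + factor(am), data = mtcars)

# anova() has no 'type' argument, so this call fails; capture the message.
msg <- tryCatch({
  anova(fit, type = 2)
  "no error"
}, error = function(e) conditionMessage(e))
print(msg)

# The sequential (type I) table is simply:
print(anova(fit))
```

For type II or type III tests, use `Anova()` from the car package, which does accept `type`.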

Here's what I get with your data. I hope that the code and output don't get too mangled:

> data <- read.csv("Saha research.csv", header=TRUE)

> data <- within(data, {
+     tem <- as.factor(temperature)
+     ac <- as.factor (AC)
+     av <- as.factor(AV)
+     thick <- as.factor(Thickness)
+ })

> library(car)
Loading required package: carData

> options(contrasts = c("contr.sum", "contr.poly"))

> mod <- lm(KIC ~ tem*ac + tem*av + tem*thick + ac*av +ac*thick + av*thick,
+           data=data)

> anova(mod) # type I (sequential)
Analysis of Variance Table

Response: KIC
           Df  Sum Sq Mean Sq  F value    Pr(>F)    
tem         2 15.3917  7.6958 427.9926 < 2.2e-16 ***
ac          2  0.1709  0.0854   4.7510 0.0096967 **
av          1  1.9097  1.9097 106.2055 < 2.2e-16 ***
thick       2  0.2041  0.1021   5.6756 0.0040359 **
tem:ac      4  0.5653  0.1413   7.8598 6.973e-06 ***
tem:av      2  1.7192  0.8596  47.8046 < 2.2e-16 ***
tem:thick   4  0.0728  0.0182   1.0120 0.4024210    
ac:av       2  0.3175  0.1588   8.8297 0.0002154 ***
ac:thick    4  0.0883  0.0221   1.2280 0.3003570    
av:thick    2  0.0662  0.0331   1.8421 0.1613058    
Residuals 190  3.4164  0.0180                      
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

> Anova(mod) # type II
Anova Table (Type II tests)

Response: KIC
           Sum Sq  Df  F value    Pr(>F)    
tem       15.3917   2 427.9926 < 2.2e-16 ***
ac         0.1709   2   4.7510 0.0096967 **
av         1.9097   1 106.2055 < 2.2e-16 ***
thick      0.2041   2   5.6756 0.0040359 **
tem:ac     0.5653   4   7.8598 6.973e-06 ***
tem:av     1.7192   2  47.8046 < 2.2e-16 ***
tem:thick  0.0728   4   1.0120 0.4024210    
ac:av      0.3175   2   8.8297 0.0002154 ***
ac:thick   0.0883   4   1.2280 0.3003570    
av:thick   0.0662   2   1.8421 0.1613058    
Residuals  3.4164 190                      
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

> Anova(mod, type=3) # type III
Anova Table (Type III tests)

Response: KIC
             Sum Sq  Df   F value    Pr(>F)    
(Intercept) 102.430   1 5696.4740 < 2.2e-16 ***
tem          15.392   2  427.9926 < 2.2e-16 ***
ac            0.171   2    4.7510 0.0096967 **
av            1.910   1  106.2055 < 2.2e-16 ***
thick         0.204   2    5.6756 0.0040359 **
tem:ac        0.565   4    7.8598 6.973e-06 ***
tem:av        1.719   2   47.8046 < 2.2e-16 ***
tem:thick     0.073   4    1.0120 0.4024210    
ac:av         0.318   2    8.8297 0.0002154 ***
ac:thick      0.088   4    1.2280 0.3003570    
av:thick      0.066   2    1.8421 0.1613058    
Residuals     3.416 190                        
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

If you have questions about Minitab there's probably another place to ask. It's not my opinion that type-III tests are generally preferable to type-II tests. Focus, in my opinion, should be on what hypotheses are being tested. If you want to see more detail, you could consult the book with which the car package is associated: see citation(package="car").
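
The three tables above are identical, which is what one expects when the design is balanced: sequential sums of squares then do not depend on the order of the terms. The following sketch illustrates this with simulated data (all names and values are made up, not taken from the paper): in a balanced layout the sum of squares for a factor is the same whichever factor is entered first, and removing a few observations from one cell breaks that equality.

```r
set.seed(42)

# Balanced 3 x 2 layout with 4 replicates per cell (simulated data).
d <- expand.grid(a = factor(1:3), b = factor(1:2), rep = 1:4)
d$y <- rnorm(nrow(d)) + as.numeric(d$a)

# Sequential SS for 'a' when it enters first vs. after 'b'.
ss_first  <- anova(lm(y ~ a * b, data = d))["a", "Sum Sq"]
ss_second <- anova(lm(y ~ b * a, data = d))["a", "Sum Sq"]
balanced_same <- isTRUE(all.equal(ss_first, ss_second))
print(balanced_same)

# Remove three of the four observations from cell (a = 1, b = 1).
d2 <- d[-c(1, 7, 13), ]
ss_first2  <- anova(lm(y ~ a * b, data = d2))["a", "Sum Sq"]
ss_second2 <- anova(lm(y ~ b * a, data = d2))["a", "Sum Sq"]
unbalanced_same <- isTRUE(all.equal(ss_first2, ss_second2))
print(unbalanced_same)

# Cell counts show whether a design is balanced.
print(table(d2$a, d2$b))
```

Tabulating the cell counts of the real data in the same way would show whether the design in the paper is balanced.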

Best,
 John

> -----Original Message-----
> From: Thanh Tran [mailto:[hidden email]]
> Sent: Tuesday, November 6, 2018 9:15 PM
> To: Fox, John <[hidden email]>
> Subject: Re: [R] Sum of Squares Type I, II, III for ANOVA
>
> Dear Prof. John Fox,
> Thank you for your answer. I have attached the CSV data file again.
> I tried to set the contrasts properly *before* fitting the model, but I ran
> into the following problem.
>
> > setwd("C:/NHAT/HOC TAP/R/Test/Anova")
> > data = read.csv("Saha research.csv", header = TRUE)
> > attach(data)
> > tem = as.factor(temperature)
> > ac = as.factor(AC)
> > av = as.factor(AV)
> > thick = as.factor(Thickness)
> > library(car)
> Loading required package: carData
> > options(contrasts = c("contr.sum", "contr.poly"))
> > mod <- lm(KIC ~ tem*ac + tem*av + tem*thick + ac*av + ac*thick + av*thick)
> > anova(mod, type = 3)
> Error: $ operator is invalid for atomic vectors
>
>
> Another problem is that in the paper that I read, the authors used MINITAB for
> the ANOVA. The authors used "adjusted sums of squares" to calculate the
> p-value. So which should I use: Type I (sequential) SS or Type III (adjusted) SS?
> Minitab help tells me that I would "usually" want to use type III adjusted SS, as
> type I sequential "sums of squares can differ when your design is unbalanced"
> - which mine is. The R functions I am using are clearly using the type I
> sequential SS.
>
> Thanks
> Nhat Tran

Re: Sum of Squares Type I, II, III for ANOVA

Thanh Tran
Dear Prof. John Fox,

Thank you for your advice. I will take more care in future posts.

Best regards,
Nhat Tran
