ANOVA Permutation Test


ANOVA Permutation Test

Juan Telleria Ruiz de Aguirre
Dear R users,

I have the following Question related to Package lmPerm:

This package provides a modified version of the aov() function, which uses
Permutation Tests instead of Normal Theory Tests for fitting an Analysis of
Variance (ANOVA) Model.

However, when I run the following code for a simple linear model:

library(lmPerm)
library(magrittr)  # provides the %>% pipe used below

# t_Downtime_per_Intervention_Successful is a data frame in my environment 'e'
e$t_Downtime_per_Intervention_Successful %>%
  aovp(
    formula = `Downtime per Intervention[h]` ~ `Working Hours`,
    data = .
  ) %>%
  summary()

I obtain different p-values for each run!

With a regular ANOVA Test I instead obtain a constant F-statistic, but my
data do not fulfill the required Normality Assumptions.

So my questions are:

Would it still be possible to use the regular aov() by generating permutations
in advance (thereby obtaining a Normal Distribution thanks to the Central
Limit Theorem) and applying the aov() function afterwards? Does that make
sense?


Or could this issue be due to unbalanced classes? I also tried weighting the
observations based on their proportions, but the function failed.


Is there any alternative solution for performing a One-Way ANOVA Test on
Non-Normal Data?


Thank you.

Juan


Re: ANOVA Permutation Test

Michael Dewey
Dear Juan

I do not use the package, but if it does permutation tests it presumably
uses random numbers, and since you are not setting the seed you will get
different values on each run.
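
Something along these lines should give you repeatable p-values (an untested
sketch, using the built-in PlantGrowth data in place of yours, since I do not
use the package myself):

library(lmPerm)

## Fix the random number generator state so the same permutations
## (and hence the same p-values) are drawn on every run.
set.seed(12345)

summary(aovp(weight ~ group, data = PlantGrowth))

## Without the set.seed() call (or with a different seed) the
## permutation p-values will again differ from run to run.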

Michael


--
Michael
http://www.dewey.myzen.co.uk/home.html


Re: ANOVA Permutation Test

Meyners, Michael
Juan,

Your question might be borderline for this list, as it ultimately seems to be a stats question in R disguise.

Anyway, the short answer is that you *expect* to get a different p value from a permutation test unless you are able to do all possible permutations and therefore use the so-called systematic reference set. That is rarely feasible, and only for relatively small problems.
The permutation test uses a random subset of all possible permutations. Given this randomness, you'll get a different p value on each run. In order to get reproducible results, you may specify a seed (?set.seed), yet that is only reproducible within this environment; someone else with different software and/or code might come out with a different p. The higher the number of permutations used, however, the smaller the variation in the p values. For most applications, 1000 permutations seem good enough to me, but sometimes I go higher (in particular if the p value is borderline and I really need a strict above/below alpha decision).
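
To illustrate that last point, something like the following should shrink the run-to-run variation by drawing more permutations (a sketch only, with the built-in PlantGrowth data as a stand-in; the control arguments maxIter and Ca are taken from the lmPerm manual and should be checked against ?aovp):

library(lmPerm)

## A stricter stopping rule (larger maxIter, smaller Ca) means more
## permutations are drawn, so the p value varies less from run to run,
## at the cost of longer computing time.
set.seed(1)
summary(aovp(weight ~ group, data = PlantGrowth,
             perm = "Prob", maxIter = 100000, Ca = 0.001))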

The permutations do not create an implicit normal distribution, but rather a null distribution that can be (and, depending on the non-normality of your data, likely is) non-normal. So that part of your proposal does not appeal.
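
In case it helps to see what that permutation null distribution actually is, here is a bare-bones sketch built by hand with base R (PlantGrowth again standing in for your data):

## Permutation null distribution of the F statistic, built by shuffling
## the group labels and refitting a plain one-way ANOVA each time.
obs_F <- summary(aov(weight ~ group, data = PlantGrowth))[[1]][["F value"]][1]

perm_F <- replicate(1000, {
  d <- PlantGrowth
  d$group <- sample(d$group)   # destroy any real group effect
  summary(aov(weight ~ group, data = d))[[1]][["F value"]][1]
})

hist(perm_F)                   # typically skewed, not a normal density

## Permutation p value: proportion of shuffled F values at least as
## extreme as the observed one.
mean(perm_F >= obs_F)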

I don't think you need an alternative - the permutation test is just fine, and once you recognize the randomness in the execution, the (relatively small) variability in p values is not a major issue.

You may want to have a look at the textbook by Edgington & Onghena for details on permutation tests, and there are plenty of papers out there addressing them in various contexts, which will help you understand *why* you observe what you observe here.

HTH, Michael





Re: ANOVA Permutation Test

S Ellison
> This package uses a modified version of aov() function, which uses
> Permutation Tests
>
> I obtain different p-values for each run!

Could that be because you are defaulting to perm="Prob"?

I am not familiar with the package, but the manual is informative.
You may have missed something when reading it.

" ...The Exact method will be used by default when the number of observations is less than or equal to
maxExact, otherwise Prob will be used.
Prob:  Iterations terminate when the estimated standard error of the estimated proportion p is less
than p*Ca"

I would assume that probabilistic permutation is random and will change from run to run.
You could use set.seed() to stop that, but it's actually quite useful to see how much the results change.
If you want complete permutation, you'd need to force Exact (unless that does not mean what it sounds like for this package).
It looks like that requires you to set maxExact to at least your number of observations. But given that the number of permutations grows combinatorially, that could take a _long_ time per run; the example in the help page does not complete in a useful time when maxExact is set to exceed the number of data points.
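
For a genuinely small dataset, forcing Exact might look something like this (an untested sketch with made-up data; maxExact as described in the manual excerpt above):

library(lmPerm)

## Deliberately tiny, hypothetical dataset so exhaustive permutation is feasible.
small <- data.frame(
  y = c(3.1, 2.8, 3.5, 3.0, 4.2, 4.6, 4.0, 4.4),
  g = factor(rep(c("A", "B"), each = 4))
)

## Request the exhaustive reference set, raising maxExact to at least
## the number of observations so Exact is actually used.
summary(aovp(y ~ g, data = small, perm = "Exact", maxExact = nrow(small)))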

So I'd probably run it using Prob and simply note the range of results for a handful of runs to give you an indication of how far to trust the answers.
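
For example (another sketch, with the built-in PlantGrowth data standing in for yours):

library(lmPerm)

## Re-fit the same model a handful of times under perm = "Prob" and
## eyeball how much the permutation p value moves around between runs.
for (i in 1:5) {
  print(summary(aovp(weight ~ group, data = PlantGrowth, perm = "Prob")))
}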

> Would it still be possible use the regular aov() by generating permutations
> in advance (Obtaining therefore a Normal Distribution thanks to the Central
> Limit Theorem)? And applying the aov() function afterwards? Does it have
> sense?
As a chemist, I'd guess No. And you'd be even more limited in the number of permutations.

> Or maybe this issue could be due to unbalanced classes? I also tried to
> weight observations based on proportions, but the function failed.
No, it's nothing to do with balance if the results change from run to run with no change in the model. I'd guess imbalance may exacerbate the permutation variability somewhat, but it won't _cause_ it.

> Any alternative solution for performing a One-Way ANOVA Test over
> Non-Normal Data?
Yes; the traditional nonparametric test for (balanced) one-way data is the Kruskal-Wallis test - see ?kruskal.test.
Classical ANOVA on ranks can also be defended as a general 'nonparametric' approach, though I gather it can also be criticised.
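
For example (using the built-in PlantGrowth data, which has a single grouping factor, as a stand-in for yours):

## Rank-based one-way test; makes no normality assumption about the response.
kruskal.test(weight ~ group, data = PlantGrowth)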




Re: ANOVA Permutation Test

Juan Telleria Ruiz de Aguirre
Thank you all for your **very good** answers:

Using aovp(..., perm="Exact") seems to be the way to go for small datasets,
and I should definitely also try ?kruskal.test.


Juan
