Shapiro-Wilk W value interpretation

Omar Baqueiro
Hello,

I have tested a sample for normality using the Shapiro-Wilk
test. The result is the following:


        Shapiro-Wilk normality test

data: mydata
W = 0.9989, p-value = 0.8791


I understand that a p-value > 0.05 (for my purposes) means that the data
IS normally distributed, but I am not sure about the W value:
which values tell me that the data is normally distributed? I know
that my data is normally distributed, but what I want to know is how
to interpret the W value. I have read that "if W is very small then
the distribution is probably not normally distributed", but how
"small" is "very small"? Also, what happens if, say, W = 0.000001
but the p-value is greater than my significance level (0.05)? Is the
hypothesis rejected?

thank you!

Omar

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: Shapiro-Wilk W value interpretation

P Ehlers

There is some confusion in your query.
First, how do you know that your data are indeed normally distributed?
That's *not* what the p-value of the test says.
Consider the following result of the Shapiro-Wilk test applied to
a vector x:

data: x
W = 0.9856, p-value = 0.988

Here x was not sampled from a normal distribution (code at end).

Second, the point of a p-value is to formalize decision-making
so that critical regions of tests are converted to p-value intervals.
Thus, your emphasis on the value of W is misplaced. It's
not how small W is but how small it is for the given sample size,
and the p-value takes care of the significance. (This is not to
say, of course, that the distribution of W is not of interest.)
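
The dependence on sample size can be checked by simulation. What follows is a minimal sketch, not part of the original exchange: it draws normal samples of several sizes and estimates the 5% quantile of W when the null hypothesis is true. The helper name w_null_quantile, the number of replicates, and the seed are arbitrary choices made for illustration.

```r
# Sketch: how the null distribution of W shifts with sample size.
# w_null_quantile() is a hypothetical helper, not part of base R.
w_null_quantile <- function(n, prob = 0.05, reps = 2000) {
  # Simulate W for 'reps' normal samples of size n
  w <- replicate(reps, shapiro.test(rnorm(n))$statistic)
  unname(quantile(w, prob))
}

set.seed(1)  # arbitrary seed, for reproducibility only
sizes <- c(10, 30, 100, 1000)
crit <- sapply(sizes, w_null_quantile)
data.frame(n = sizes, W_5pct = round(crit, 4))
```

The estimated cutoff should move toward 1 as n grows, so a W that is unremarkable for n = 10 can be strong evidence against normality for n = 1000. This is why the p-value, not the raw W, is used for the decision.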

Finally, what exactly, in your view, is "the hypothesis"?

I hope this doesn't sound too critical. I'm trying to be helpful.

Peter Ehlers


# Code for the example above: x is an exponential (non-normal) sample of size 10
set.seed(34); x <- rexp(10); shapiro.test(x)

Re: Shapiro-Wilk W value interpretation

Omar Baqueiro
Thanks for your answer. I will try to make myself clearer; please do not
hesitate to point out what I have got wrong.

> There is some confusion in your query.
> First, how do you know that your data are indeed normally distributed?
In this specific case I *know* that my data is normally distributed
because I generated it using a random number generator. I showed it as
an example for my question because I want to be sure I understand the
meaning of the W value.

> That's *not* what the p-value of the test says.

Then maybe I do not understand very well what the p-value reported by
R is. From my understanding, a p-value is generally defined as "the
smallest fixed level at which the null hypothesis can be rejected".
Therefore, if I fix my significance level at 0.01, a p-value of 0.8791
indicates that the null hypothesis is NOT rejected.
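
The difference between "not rejected" and "shown to be true" can be illustrated by looking at how Shapiro-Wilk p-values behave when the null hypothesis really holds. The following is a minimal sketch under assumed settings (sample size 50, 5000 replicates, an arbitrary seed), none of which come from the thread.

```r
# Sketch: distribution of Shapiro-Wilk p-values when H_0 is true.
set.seed(2)   # arbitrary seed
reps <- 5000  # arbitrary number of simulated samples
p_null <- replicate(reps, shapiro.test(rnorm(50))$p.value)

# Under H_0 the p-value is roughly uniform on (0, 1), so the share of
# p-values below alpha is close to alpha itself.
alpha <- c(0.01, 0.05, 0.10)
reject_rate <- sapply(alpha, function(a) mean(p_null < a))
data.frame(alpha = alpha, reject_rate = reject_rate)
```

Because the p-value is roughly uniform under H_0, a value such as 0.8791 is no stronger support for normality than, say, 0.30. Both simply mean that the test found no evidence against normality.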

> Consider the following result of the Shapiro-Wilk test applied to
> a vector x:
>
> data: x
> W = 0.9856, p-value = 0.988
>
> Here x was not sampled from a normal distribution (code at end).

Then how should those results be interpreted? And how would they
compare with:

W = 0.9932, p-value = 0.8996
obtained from the first example on the R shapiro.test help page
(http://stat.ethz.ch/R-manual/R-patched/library/stats/html/shapiro.test.html),
rnorm(100, mean = 5, sd = 3),

and with, for example:
W = 0.9479, p-value = 0.0006035
obtained from the second example on the same page, runif(100, min =
2, max = 4)?


It would seem to me that the p-value indeed follows the logic I
mentioned earlier. That is, the first example IS a normal sample
(it was generated by rnorm) and its p-value is > 0.01, which means
that H_0 will mostly be accepted, while the second example is NOT
normal (generated with runif) and its p-value is < 0.01, hence H_0
will usually be rejected. Here I assume H_0 is that "the sample
is normally distributed".

Is this right? The result given by your example puzzled me at first, but
if you increase the sample size, the test gives a lower
p-value:

set.seed(34); shapiro.test(rexp(30))
W = 0.8898, p-value = 0.004773

which means the sample is MOST LIKELY not normally distributed (as the
p-value is < 0.01).
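
The effect of sample size seen with rexp(10) versus rexp(30) can be examined over many samples rather than one seed. The following is a minimal sketch: sw_power is a hypothetical helper, and the sample sizes, the 0.01 level, and the number of replicates are assumptions chosen for illustration.

```r
# Sketch: rejection rate of the Shapiro-Wilk test for exponential data.
# sw_power() is a hypothetical helper, not part of base R.
sw_power <- function(n, alpha = 0.01, reps = 2000) {
  p <- replicate(reps, shapiro.test(rexp(n))$p.value)
  mean(p < alpha)  # fraction of samples for which H_0 is rejected
}

set.seed(3)  # arbitrary seed
sizes <- c(10, 30, 100)
rej <- sapply(sizes, sw_power)
data.frame(n = sizes, rejection_rate = rej)
```

With only 10 observations the test should miss the non-normality in a large share of samples, which is how an exponential sample can produce a p-value as high as 0.988. The rejection rate rises as n increases.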
>
> Second, the point of a p-value is to formalize decision-making
> so that critical regions of tests are converted to p-value intervals.
> Thus, your emphasis on the value of W is misplaced. It's
> not how small W is but how small it is for the given sample size,
> and the p-value takes care of the significance.
I know what the p-value means; what I do not know is what the W value
means. Suppose I report somewhere that a certain sample is normally
distributed because, after doing the Shapiro-Wilk test, the results
were W = 0.9989, p-value = 0.8791. Then I can say that H_0 in the
Shapiro-Wilk test is true because the p-value > 0.01, BUT I do not
know what to say about W.

> (This is not to
> say, of course, that the distribution of W is not of interest.)
>

And this is where my question goes: what is the meaning of W for
the normal distribution? Specifically, what does W say *about* the
data (does it say something? or is it ), when testing for
normality, of course?
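
One common way to read W is as a measure of how straight the normal Q-Q plot of the sample is: W is usually close to the squared correlation between the sorted data and the expected normal order statistics (that squared correlation is essentially the related Shapiro-Francia statistic). The following is a minimal sketch of that comparison, reusing the two help-page examples mentioned above; qq_r2 is a hypothetical helper and the seed is arbitrary.

```r
# Sketch: W is close to the squared correlation of a normal Q-Q plot.
set.seed(4)  # arbitrary seed
x_norm <- rnorm(100, mean = 5, sd = 3)
x_unif <- runif(100, min = 2, max = 4)

# qq_r2() is a hypothetical helper: squared correlation between the
# sorted data and approximate expected normal order statistics.
qq_r2 <- function(x) {
  n <- length(x)
  cor(sort(x), qnorm(ppoints(n)))^2
}

W_norm <- unname(shapiro.test(x_norm)$statistic)
W_unif <- unname(shapiro.test(x_unif)$statistic)
data.frame(sample = c("rnorm", "runif"),
           W      = c(W_norm, W_unif),
           qq_r2  = c(qq_r2(x_norm), qq_r2(x_unif)))
```

W cannot exceed 1, and values near 1 correspond to a nearly straight Q-Q plot. How far below 1 counts as "small" depends on the sample size, which is what the p-value accounts for.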

> Finally, what exactly, in your view, is "the hypothesis"?
>
The hypothesis is H_0: the sample is normally distributed (isn't that
what Shapiro-Wilk aims to test?)

> I hope this doesn't sound too critical. I'm trying to be helpful.
>
Again, do not worry, and thank you VERY MUCH for taking the time to
read and answer this. Also, if my tone seems harsh in this mail, that
is not intended; I just tried to write down the facts.


Regards,

Omar


On 9/30/07, P Ehlers <[hidden email]> wrote:

> set.seed(34); shapiro.test(rexp(10))
>
>
>


--
Omar Baqueiro Espinosa
Computer Science PhD Candidate
Computer Systems Engineer
Workpage: www.csc.liv.ac.uk/~omar/
HomePage (spanish):http://www.baqueiro.co.uk/
PGP Key available at: www.csc.liv.ac.uk/~omar/pgp.html