Thanks for your answer. I will try to make myself clearer; please do not

hesitate to point out what I have got wrong.

> There is some confusion in your query.

> First, how do you know that your data are indeed normally distributed?

In this specific case I *know* that my data is normally distributed

because I generated it using a random number generator. I showed it as

an example for my question because I want to be sure I understand the

meaning of the W value.

> That's *not* what the p-value of the test says.

Then maybe I do not understand very well what the p-value reported by R

is. From my understanding, p-values are generally accepted as "the

smallest fixed level at which the null hypothesis can be rejected".

Therefore, if I fix my significance level at 0.01, a p-value of 0.8791

will indicate that the null hypothesis is NOT rejected.

> Consider the following result of the Shapiro-Wilk test applied to

> a vector x:

>

> data: x

> W = 0.9856, p-value = 0.988

>

> Here x was not sampled from a normal distribution (code at end).

Then how would those results be interpreted? And how would they

be compared against:

W = 0.9932, p-value = 0.8996

obtained from the example given on the R Shapiro-Wilk help page

(http://stat.ethz.ch/R-manual/R-patched/library/stats/html/shapiro.test.html),

rnorm(100, mean = 5, sd = 3), against for example:

W = 0.9479, p-value = 0.0006035

obtained from the second example on the same page,

runif(100, min = 2, max = 4)?

It would seem to me that the p-value indeed follows the logic I

mentioned earlier. The first example IS a normal distribution (it was

generated by rnorm) and its p-value is > 0.01, which means that H_0

will mostly be accepted, while the second example IS NOT normal

(generated with runif) and its p-value is < 0.01, hence H_0 will

usually be rejected. Here I assume H_0 is "the sample is normally

distributed".
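
For reference, the two help-page examples and the decision rule I describe can be sketched in R as below. The seed and the 0.01 level are my own choices for illustration, so the W and p-values printed will differ from the ones quoted above.

```r
# Sketch only: the seed and alpha are assumptions chosen for illustration.
set.seed(1)
alpha <- 0.01

normal_sample  <- rnorm(100, mean = 5, sd = 3)   # drawn from a normal distribution
uniform_sample <- runif(100, min = 2, max = 4)   # drawn from a uniform distribution

res_normal  <- shapiro.test(normal_sample)
res_uniform <- shapiro.test(uniform_sample)

# Decision rule: H_0 (normality) is rejected when the p-value is below alpha;
# otherwise H_0 is not rejected at that level.
reject_normal  <- res_normal$p.value  < alpha
reject_uniform <- res_uniform$p.value < alpha

print(res_normal)
print(res_uniform)
print(c(reject_normal = reject_normal, reject_uniform = reject_uniform))
```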

Is this right? The result given by your sample puzzled me at first, but

if you increase the sample size, the test gives a lower

p-value:

set.seed(34); shapiro.test(rexp(30))

W = 0.8898, p-value = 0.004773

which means it is MOST LIKELY not normally distributed (as the p-value

is < 0.01).
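
To see how the sample size affects the result, the same test can be repeated on exponential samples of increasing size. This is only a rough sketch: the helper name `shapiro_for_n` and the sizes 100 and 300 are my own additions, not part of the examples above.

```r
# Hypothetical helper: Shapiro-Wilk test on an exponential sample of size n.
shapiro_for_n <- function(n, seed = 34) {
  set.seed(seed)                       # same seed as in the examples above
  res <- shapiro.test(rexp(n))
  c(n = n, W = unname(res$statistic), p.value = res$p.value)
}

sizes   <- c(10, 30, 100, 300)         # 100 and 300 are illustrative extras
results <- t(sapply(sizes, shapiro_for_n))

# One row per sample size, with columns n, W and p.value.
print(results)
```

Comparing the rows shows whether the non-normality of the exponential data becomes easier to detect as n grows.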

>

> Second, the point of a p-value is to formalize decision-making

> so that critical regions of tests are converted to p-value intervals.

> Thus, your emphasis on the value of W is misplaced. It's

> not how small W is but how small it is for the given sample size,

> and the p-value takes care of the significance.

I know what the p-value means; what I do not know is what the W value

means. What I need to know is this: if I report somewhere that a certain

distribution is normal because the Shapiro-Wilk test gave

W = 0.9989, p-value = 0.8791, then I can say that H_0 of the

Shapiro-Wilk test is true because the p-value is > 0.01, BUT I do not

know what to say about W.

> (This is not to

> say, of course, that the distribution of W is not of interest.)

>

And this is where my question lies: what is the meaning of W for

the normal distribution? Specifically, what does W say *about* the

data (does it say something? or is it ), when testing for

normality of course?
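
One way to explore this numerically: W always lies between 0 and 1, and it is often described as close to the squared correlation between the sorted data and the values expected from a normal sample, so that values near 1 correspond to a nearly straight normal Q-Q plot. The sketch below compares the two. The correlation here is only an approximation (in the spirit of the Shapiro-Francia statistic), not the exact coefficients that shapiro.test uses, and the seed and sample are arbitrary.

```r
set.seed(1)                            # arbitrary seed, for reproducibility only
x <- rnorm(100)
n <- length(x)

# W as reported by the Shapiro-Wilk test.
w_exact <- unname(shapiro.test(x)$statistic)

# Approximation (an assumption of this sketch): squared correlation between
# the sorted data and approximate expected normal order statistics,
# i.e. the straightness of the normal Q-Q plot expressed as one number.
w_approx <- cor(sort(x), qnorm(ppoints(n)))^2

print(c(W = w_exact, approx = w_approx))
```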

> Finally, what exactly, in your view, is "the hypothesis"?

>

The hypothesis is H_0: the sample is normally distributed (isn't that

what Shapiro-Wilk aims to test?)

> I hope this doesn't sound too critical. I'm trying to be helpful.

>

Again, do not worry, and thank you VERY MUCH for your time answering

and reading this. Also, if my tone seems harsh in this mail, it is not

intended as that, I just tried to write down the facts.

Regards,

Omar

On 9/30/07, P Ehlers <[hidden email]> wrote:

>

> Omar Baqueiro wrote:

> > Hello,

> >

> > I have tested a distribution for normality using the Shapiro-Wilk

> > statistic. The result of this is the following:

> >

> >

> > Shapiro-Wilk normality test

> >

> > data: mydata

> > W = 0.9989, p-value = 0.8791

> >

> >

> > I know that the p-value > 0.05 (for my purposes) means that the data

> > IS normally distributed, but what I am not sure about is the W value:

> > what values tell me that the data is normally distributed? I know

> > that my data is normally distributed, but what I want to know is how

> > to interpret the W value. I have read that "if W is very small then

> > the distribution is probably not normally distributed", but how

> > "small" is "very small"? And what happens if, say, W = 0.000001

> > but the p-value is > my significance level (0.05)? Is the hypothesis

> > rejected?

> >

>

> Peter Ehlers

>

> > thank you!

> >

> > Omar

> >

> > ______________________________________________

> > [hidden email] mailing list

> > https://stat.ethz.ch/mailman/listinfo/r-help

> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html

> > and provide commented, minimal, self-contained, reproducible code.

> >

> >

>

> set.seed(34); shapiro.test(rexp(10))

>

>

>

--

Omar Baqueiro Espinosa

Computer Science PhD Candidate

Computer Systems Engineer

Workpage: www.csc.liv.ac.uk/~omar/

HomePage (spanish): http://www.baqueiro.co.uk/

PGP Key available at: www.csc.liv.ac.uk/~omar/pgp.html
