Depletion of small p values upon iterative testing of identical normal distributions

A-2
Dear all,

I'm performing a t-test on two normal distributions with identical mean &
standard deviation, and repeating this test a very large number of times to
describe a representative p value distribution in the null case. As part of
this, the program bins these values into 10 equal-width bins between 0
and 1 and reports the number of observations in each bin. What I have
noticed is that even after 500,000 replications the count in my lowest bin
is consistently ~5% smaller than the counts in all the other bins, which
agree with each other to within about 1%. Is there any reason, perhaps to do
with random number generation in R or the nature of the normal distribution
simulated by the rnorm function, that could explain this depletion?

Here are two key parts of my code to show what functions I'm working with:

#Calculating the p values
while(i<numtests){
Group1<-rnorm(6,-0.0065,0.0837)
Group2<-rnorm(6,-0.0065,0.0837)
PV<-t.test(Group1,Group2)$p.value
pscoresvector<-c(PV,pscoresvector)
i<-i+1
}

#Binning the results
freqtbl1<-binning(pscoresvector,breaks=bins)

Thanks in advance for any insights,

Andrew


______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: Depletion of small p values upon iterative testing of identical normal distributions

Duncan Murdoch-2
  On 20/09/2010 9:54 AM, A wrote:

> Dear all,
>
> I'm performing a t-test on two normal distributions with identical mean &
> standard deviation, and repeating this test a very large number of times to
> describe a representative p value distribution in the null case. As part of
> this, the program bins these values into 10 equal-width bins between 0
> and 1 and reports the number of observations in each bin. What I have
> noticed is that even after 500,000 replications the count in my lowest bin
> is consistently ~5% smaller than the counts in all the other bins, which
> agree with each other to within about 1%. Is there any reason, perhaps to do
> with random number generation in R or the nature of the normal distribution
> simulated by the rnorm function, that could explain this depletion?

No: with equal-width bins you should expect roughly equal counts in each
bin.  But your code may have errors in it.

This is a very slow and slightly dangerous way to program in R:

> Here are two key parts of my code to show what functions I'm working with:
>
> #Calculating the p values
> while(i<numtests){
> Group1<-rnorm(6,-0.0065,0.0837)
> Group2<-rnorm(6,-0.0065,0.0837)
> PV<-t.test(Group1,Group2)$p.value
> pscoresvector<-c(PV,pscoresvector)
> i<-i+1
> }

The slowness comes because pscoresvector is growing by one entry every
iteration.  The danger comes because the initialization of pscoresvector
is not shown.  If it wasn't initialized, you'll be binning whatever junk
was there before, as well as the new values.

I'd suggest initializing it as

pscoresvector <- numeric(numtests)

and updating using

pscoresvector[i] <- PV

to avoid both problems.
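Putting the two fixes together, a minimal sketch of the preallocated loop (my illustration of Duncan's suggestion, using the parameters from the original post; `numtests` is kept small here):

```r
numtests <- 1000                      # small value for illustration
pscoresvector <- numeric(numtests)    # preallocate instead of growing
for (i in seq_len(numtests)) {
  Group1 <- rnorm(6, -0.0065, 0.0837)
  Group2 <- rnorm(6, -0.0065, 0.0837)
  pscoresvector[i] <- t.test(Group1, Group2)$p.value  # store in place
}
```

Because the vector is allocated once, each iteration is a constant-time assignment rather than a copy of an ever-growing vector.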

Duncan Murdoch

> #Binning the results
> freqtbl1<-binning(pscoresvector,breaks=bins)
>
> Thanks in advance for any insights,
>
> Andrew


Re: Depletion of small p values upon iterative testing of identical normal distributions

bbolker
In reply to this post by A-2
A <petlyakov <at> gmail.com> writes:

>
> Dear all,

 [snip]

> Here are two key parts of my code to show what functions I'm working with:
>
> #Calculating the p values
> while(i<numtests){
> Group1<-rnorm(6,-0.0065,0.0837)
> Group2<-rnorm(6,-0.0065,0.0837)
> PV<-t.test(Group1,Group2)$p.value
> pscoresvector<-c(PV,pscoresvector)
> i<-i+1
> }
>
> #Binning the results
> freqtbl1<-binning(pscoresvector,breaks=bins)

Also consider

> f <- function() {
+   Group1 <- rnorm(6,-0.0065,0.0837)
+   Group2 <- rnorm(6,-0.0065,0.0837)
+   t.test(Group1,Group2)$p.value
+ }
> pscoresvector <- replicate(10000,f())
> table(cut(pscoresvector,breaks=...))

  Is your lowest bin perhaps narrower than your other bins
(although admittedly I can't see an easy reason it should be
only 5% smaller -- 50% would be a more typical result of a binning
mistake)?
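For concreteness (the `breaks` above are left elided in the original), the tabulation step with ten equal-width bins could look like the following sketch; a uniform sample stands in for the p-values here:

```r
# hypothetical stand-in data: 10000 draws that should be uniform on (0,1)
pscoresvector <- runif(10000)
# ten equal-width bins on (0,1]; each count should be near 1000
freqtbl1 <- table(cut(pscoresvector, breaks = seq(0, 1, by = 0.1)))
freqtbl1
```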


Re: Depletion of small p values upon iterative testing of identical normal distributions

Ted.Harding-2
In reply to this post by A-2
On 20-Sep-10 13:54:56, A wrote:

> [snip]

The issue lies in the t-test, not in the random number generation.

Look at '?t.test':
  t.test(x, y = NULL,
         alternative = c("two.sided", "less", "greater"),
         mu = 0, paired = FALSE, var.equal = FALSE,
         conf.level = 0.95, ...)

and note "var.equal = FALSE" as the default. Then:

  var.equal: a logical variable indicating whether to treat the two
        variances as being equal. If 'TRUE' then the pooled variance
        is used to estimate the variance otherwise the Welch (or
        Satterthwaite) approximation to the degrees of freedom is
        used.

So the default t-test is not the t-test you were perhaps expecting!
I used a variant of your code (to be absolutely sure of what I was
doing):

  numtests<-100000 ; PVals <- numeric(numtests)
  for(i in (1:numtests)){
    Group1<-rnorm(6,-0.0065,0.0837)
    Group2<-rnorm(6,-0.0065,0.0837)
    PVals[i] <- t.test(Group1,Group2)$p.value
  }
  hist(PVals,breaks=0.1*(0:10))

and observed similar behaviour to what you report. But, with
"var.equal=TRUE":

  numtests<-100000 ; PVals <- numeric(numtests)
  for(i in (1:numtests)){
    Group1<-rnorm(6,-0.0065,0.0837)
    Group2<-rnorm(6,-0.0065,0.0837)
    PVals[i] <- t.test(Group1,Group2,var.equal=TRUE)$p.value
  }
  hist(PVals,breaks=0.1*(0:10))

all the bin-values were similar.

So what happens here is that the Welch/Satterthwaite approximation
does not produce uniformly distributed P-values when the Null
Hypothesis is true (at any rate for sample sizes as small as the
6 you are using).
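To put a number on the discrepancy (my addition, not part of Ted's post), one can run a chi-squared goodness-of-fit test on the ten bin counts; `chisq.test` on a table assumes equal expected frequencies by default:

```r
numtests <- 10000
welch  <- numeric(numtests)
pooled <- numeric(numtests)
for (i in seq_len(numtests)) {
  g1 <- rnorm(6, -0.0065, 0.0837)
  g2 <- rnorm(6, -0.0065, 0.0837)
  welch[i]  <- t.test(g1, g2)$p.value                    # Welch (the default)
  pooled[i] <- t.test(g1, g2, var.equal = TRUE)$p.value  # pooled-variance test
}
welch_counts  <- table(cut(welch,  breaks = 0.1 * (0:10)))
pooled_counts <- table(cut(pooled, breaks = 0.1 * (0:10)))
chisq.test(welch_counts)   # tends to show the depleted low bin
chisq.test(pooled_counts)  # tends to be consistent with uniform counts
```

Running both tests on the same simulated samples makes the comparison direct: only the choice of variance treatment differs.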

Hoping this helps,
Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <[hidden email]>
Fax-to-email: +44 (0)870 094 0861
Date: 20-Sep-10                                       Time: 16:08:41
------------------------------ XFMail ------------------------------


Re: Depletion of small p values upon iterative testing of identical normal distributions

Thomas Lumley
In reply to this post by Duncan Murdoch-2
On Mon, 20 Sep 2010, Duncan Murdoch wrote:

> On 20/09/2010 9:54 AM, A wrote:
>> [snip]
>
> No, equal sized bins should expect equal numbers of entries.  But your code
> may have errors in it.
>

It's actually more interesting than that.

Here's the code in my preferred format for small simulations:
one.test <- function (m = -0.0065, s = 0.0837)
{
     x <- rnorm(6, m, s)
     y <- rnorm(6, m, s)
     t.test(x, y)$p.value
}

pvals <- replicate(1e5, one.test())

Now, binning could be done wrong, and in any case isn't a good way to compare distributions, so we try
    qqplot(pvals, ppoints(1e5))
and, since the issue is with small p-values
    qqplot(log10(pvals), log10(ppoints(1e5)))

and we see that there are in fact fewer small p-values than there should be, starting at about 10^(-2.5).

After thinking about this for a while, we realize that the t.test() function shouldn't be expected to give exactly uniform p-values -- it's not as if it is doing an exact test, after all.

We smugly repeat the exercise with
one.ev.test <- function (m = -0.0065, s = 0.0837)
{
     x <- rnorm(6, m, s)
     y <- rnorm(6, m, s)
     t.test(x, y, var.equal=TRUE)$p.value
}
ev.pvals <- replicate(1e5, one.ev.test())

and the problem is solved.
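As a quick numeric check (my follow-up, not part of Thomas's post), the empirical rejection rate of the equal-variance test at the 5% level should sit near its nominal value under the null:

```r
one.ev.test <- function(m = -0.0065, s = 0.0837) {
  x <- rnorm(6, m, s)
  y <- rnorm(6, m, s)
  t.test(x, y, var.equal = TRUE)$p.value
}
ev.pvals <- replicate(1e4, one.ev.test())
mean(ev.pvals < 0.05)  # should land close to the nominal 0.05
```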

        -thomas

Thomas Lumley
Professor of Biostatistics
University of Washington, Seattle
