Bootstrap P-Value


Bootstrap P-Value

AbouEl-Makarim Aboueissa-3
Dear All,

I am trying to compute the p-value of a bootstrap test; please see
below.

In Example 1 the p-value agrees with the confidence interval.
But in Example 2 the p-value does NOT agree with the confidence
interval: in Example 2 the p-value should be zero or close to zero.

I am not sure what went wrong, or whether I missed something.

Any help would be appreciated.


With many thanks,
abou



#####  Two - Sample Bootstrap

#####  Source: http://www.ievbras.ru/ecostat/Kiril/R/Biblio_N/R_Eng/Chernick2011.pdf

#####  Example 1:
#####  ----------



set.seed(1)

n1 <- 29
n1
x1 <- rnorm(n1, 1.143, 0.164) #some random normal variates: mean1 = 1.143
x1

n2 <- 33
n2
x2 <- rnorm(n2, 1.175, 0.169) #2nd random sample: mean2 = 1.175
x2

obs.diff.theta <- mean(x1) - mean(x2)
obs.diff.theta

theta <- as.vector(NULL) #### vector to hold difference estimates

iterations <- 1000

for (i in 1:1000) {                        #bootstrap resamples
 xx1 <- sample(x1, n1, replace = TRUE)
 xx2 <- sample(x2, n2, replace = TRUE)
 theta[i] <- mean(xx1) - mean(xx2)
 }



##### Confidence Interval:
##### --------------------


quantile(theta, probs = c(0.025, 0.975)) # Efron percentile CI on difference in means

#####       2.5%     97.5%
##### -0.1248539 0.0137601


##### P-Value
##### -------

p.value <- (sum (abs(theta) >= obs.diff.theta) + 1)/ (iterations+1)

#####  p.value <- (sum (theta >= obs.diff.theta) + 1)/ (iterations+1)

p.value



#### R OUTPUT

#### > quantile(theta, probs = c(.025,0.975))
####        2.5%       97.5%
#### -0.12647744  0.02099391

#### > p.value <- (sum (abs(theta) >= obs.diff.theta) + 1)/ (iterations+1)
#### > p.value
#### [1] 1

#####  Example 2:
#####  ----------


set.seed(5)

n1 <- 29
### n1
x1 <- rnorm(n1, 10.5, 0.15) ######   sample 1 with mean1 = 10.5
### x1

n2 <- 33
### n2
x2 <- rnorm(n2, 1.5, 0.155) #####  Sample 2 with mean2 = 1.5
### x2

obs.diff.theta <- mean(x1) - mean(x2)
obs.diff.theta

theta <- as.vector(NULL) #### vector to hold difference estimates

iterations <- 1000

#####   bootstrap resamples

for (i in 1:1000) {
 xx1 <- sample(x1, n1, replace = TRUE)
 xx2 <- sample(x2, n2, replace = TRUE)
 theta[i] <- mean(xx1) - mean(xx2)
 }



##### Confidence Interval:
##### --------------------


######  CI on difference in means

quantile(theta, probs = c(.025,0.975))



##### P-Value
##### -------

p.value <- (sum (abs(theta) >= obs.diff.theta) + 1)/ (iterations+1)

##### p.value <- (sum (theta >= obs.diff.theta) + 1)/ (iterations+1)

p.value

##### R OUTPUT

####   > ######  CI on difference in means
####   >
####   > quantile(theta, probs = c(.025,0.975))
####       2.5%    97.5%
####   8.908398 9.060601

####   > ##### P-Value
####   > p.value <- (sum (abs(theta) >= obs.diff.theta) + 1)/ (iterations+1)

####   > p.value
####   [1] 0.4835165

______________________


AbouEl-Makarim Aboueissa, PhD

Professor, Statistics and Data Science
Graduate Coordinator

Department of Mathematics and Statistics
University of Southern Maine


______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: Bootstrap P-Value

glsnow
A p-value is for testing a specific null hypothesis, but you do not
state your null hypothesis anywhere.

It is the null value that needs to be subtracted from the bootstrap
differences, not the observed difference.  By subtracting the observed
difference you are setting up a situation where the p-value will always
be about 0.5 or about 1 (depending on whether the test is 1-tailed or
2-tailed).  If
instead you subtract a null value (such as 0), then the p-values will
be closer to what you are expecting.
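
A minimal sketch of one way to act on this, assuming a null value of 0 and the Example 2 data from the original post. The bootstrap differences are centered on the observed difference, so the sketch first shifts them to be centered on the null value and then counts how many lie at least as far from the null as the observed difference does. The shift step is an assumption about how to apply the advice, and the names boot_p_value and null.value are illustrative; neither comes from the thread.

```r
# Hypothetical helper (not from the thread): two-sided bootstrap p-value
# for H0: mean1 - mean2 = null.value, using a shifted bootstrap distribution.
boot_p_value <- function(x1, x2, null.value = 0, iterations = 1000) {
  obs.diff <- mean(x1) - mean(x2)
  theta <- numeric(iterations)          # preallocate the resampled differences
  for (i in seq_len(iterations)) {
    xx1 <- sample(x1, length(x1), replace = TRUE)
    xx2 <- sample(x2, length(x2), replace = TRUE)
    theta[i] <- mean(xx1) - mean(xx2)
  }
  # Re-center the bootstrap distribution on the null value.
  theta.null <- theta - obs.diff + null.value
  # Resamples at least as far from the null as the observed difference.
  extreme <- abs(theta.null - null.value) >= abs(obs.diff - null.value)
  (sum(extreme) + 1) / (iterations + 1)
}

set.seed(5)
x1 <- rnorm(29, 10.5, 0.15)   # Example 2, sample 1
x2 <- rnorm(33, 1.5, 0.155)   # Example 2, sample 2
p2 <- boot_p_value(x1, x2)
p2                            # 1/1001, about 0.001
```

In the original code, abs(theta) is compared with obs.diff.theta while theta itself is centered on obs.diff.theta, so roughly half of the resamples exceed it; that is where the reported 0.48 comes from. After shifting, no resample is anywhere near 9 units from 0, so only the "+ 1" term remains in the numerator.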

On Fri, Nov 6, 2020 at 9:44 AM AbouEl-Makarim Aboueissa
<[hidden email]> wrote:




--
Gregory (Greg) L. Snow Ph.D.
[hidden email]


Re: Bootstrap P-Value

AbouEl-Makarim Aboueissa-3
Dear Greg:

H0: Mean 1 - Mean 2 = 0
Ha: Mean 1 - Mean 2 != 0
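
For these hypotheses, one way to get a two-sided p-value that is tied directly to the percentile interval is to invert the interval: take twice the smaller proportion of bootstrap differences falling on either side of 0. This is a rough sketch under that assumption, using the Example 1 data from the first post; ci_inversion_p is an illustrative name, not something used in the thread.

```r
# Hypothetical helper (not from the thread): p-value obtained by inverting
# the percentile interval for H0: mean1 - mean2 = null.value.
ci_inversion_p <- function(theta, null.value = 0) {
  lower <- mean(theta <= null.value)   # proportion at or below the null
  upper <- mean(theta >= null.value)   # proportion at or above the null
  min(1, 2 * min(lower, upper))
}

set.seed(1)
x1 <- rnorm(29, 1.143, 0.164)          # Example 1, sample 1
x2 <- rnorm(33, 1.175, 0.169)          # Example 1, sample 2

# 1000 bootstrap differences in means
theta <- replicate(1000, mean(sample(x1, replace = TRUE)) -
                         mean(sample(x2, replace = TRUE)))

ci <- quantile(theta, probs = c(0.025, 0.975))  # percentile interval
p1 <- ci_inversion_p(theta)
ci
p1
```

When 0 lies well inside the percentile interval the p-value is large, and when every bootstrap difference falls on one side of 0 (as in Example 2) it is 0.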

with many thanks
abou
On Fri, Nov 6, 2020 at 12:35 PM Greg Snow <[hidden email]> wrote:
