loop for comparing two or more groups using bootstrapping

classic Classic list List threaded Threaded
5 messages Options
Reply | Threaded
Open this post in threaded view
|

loop for comparing two or more groups using bootstrapping

Kristi Glover
Hi R users,

I was trying to test a null hypothesis of difference between two groups was 0. I have many years data, such as year1, year2,, year3, year4 and I was trying to compare between year1 and year2, year1 and year3, year2 and year3 and so on and have used following code with an example data.


I tried to make a loop but did not work to compare between many years, and also want to obtain the exact p value. Would you mind to help me to make a loop?

Thanks for your help.


KG


daT<-structure(list(year1 = c(0.417, 0.538, 0.69, 0.688, 0.688, 0.606,

0.667, 0.7, 0.545, 0.462, 0.711, 0.642, 0.744, 0.604, 0.612,

0.667, 0.533, 0.556, 0.444, 0.526, 0.323, 0.308, 0.195, 0.333,

0.323, 0.256, 0.345, 0.205, 0.286, 0.706, 0.7, 0.6, 0.571, 0.364,

0.429, 0.326, 0.571, 0.424, 0.341, 0.387, 0.341, 0.324, 0.696,

0.696, 0.583, 0.556, 0.645, 0.435, 0.471, 0.556), year2 = c(0.385,

0.552, 0.645, 0.516, 0.629, 0.595, 0.72, 0.638, 0.557, 0.588,

0.63, 0.744, 0.773, 0.571, 0.723, 0.769, 0.667, 0.667, 0.526,

0.476, 0.294, 0.323, 0.222, 0.556, 0.263, 0.37, 0.357, 0.25,

0.323, 0.778, 0.667, 0.636, 0.583, 0.432, 0.412, 0.333, 0.571,

0.39, 0.4, 0.452, 0.326, 0.471, 0.7, 0.75, 0.615, 0.462, 0.556,

0.4, 0.696, 0.465), year3 = c(0.435, 0.759, 0.759, 0.759, 0.714,

0.593, 0.651, 0.683, 0.513, 0.643, 0.652, 0.757, 0.791, 0.649,

0.78, 0.5, 0.5, 0.5, 0.533, 0.429, 0.333, 0.286, 0.231, 0.533,

0.303, 0.417, 0.333, 0.333, 0.357, 0.909, 1, 0.952, 0.8, 0.556,

0.529, 0.562, 0.762, 0.513, 0.733, 0.611, 0.733, 0.647, 0.909,

0.857, 0.8, 0.556, 0.588, 0.562, 0.857, 0.513), year4 = c(0.333,

0.533, 0.6, 0.483, 0.743, 0.5, 0.691, 0.619, 0.583, 0.385, 0.653,

0.762, 0.844, 0.64, 0.667, 0.571, 0.571, 0.615, 0.421, 0.5, 0.205,

0.308, 0.25, 0.6, 0.242, 0.308, 0.276, 0.235, 0.211, 0.9, 0.632,

0.72, 0.727, 0.356, 0.5, 0.368, 0.5, 0.41, 0.562, 0.514, 0.4,

0.409, 0.632, 0.72, 0.727, 0.4, 0.5, 0.421, 0.5, 0.462)), .Names = c("year1",

"year2", "year3", "year4"), row.names = c(NA, -50L), class = "data.frame")

head(daT)

# null hypothesis; difference is equal to zero

dif1.2<-daT$year2-daT$year1

k=10000

mysamples1.2=replicate(k, sample(dif1.2, replace=T))

mymeans1.2=apply(mysamples1.2, 2, mean)

quantile(mymeans1.2, c(0.025, 0.975))

hist(mysamples1.2)

mean(mymeans1.2)

#what is p value?


#similarly Now I want to compare between year 1 and year3,

dif1.3<-daT$year3-daT$year1

mysamples1.3=replicate(k, sample(dif1.3, replace=T))

mymeans1.3=apply(mysamples1.3, 2, mean)

quantile(mymeans1.3, c(0.025, 0.975))


        [[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: loop for comparing two or more groups using bootstrapping

Jim Lemon-4
Hi Kristy,
Try this:

colname.mat<-combn(paste0("year",1:4),2)
samplenames<-apply(colname.mat,2,paste,collapse="")
k<-10000
for(column in 1:ncol(colname.mat)) {
 assign(samplenames[column],replicate(k,sample(unlist(daT[,colname.mat[,column]]),3,TRUE)))
}

Then use get(samplenames[1]) and so on to access the values.

Jim
On Tue, Sep 11, 2018 at 4:52 PM Kristi Glover <[hidden email]> wrote:

>
> Hi R users,
>
> I was trying to test a null hypothesis of difference between two groups was 0. I have many years data, such as year1, year2,, year3, year4 and I was trying to compare between year1 and year2, year1 and year3, year2 and year3 and so on and have used following code with an example data.
>
>
> I tried to make a loop but did not work to compare between many years, and also want to obtain the exact p value. Would you mind to help me to make a loop?
>
> Thanks for your help.
>
>
> KG
>
>
> daT<-structure(list(year1 = c(0.417, 0.538, 0.69, 0.688, 0.688, 0.606,
>
> 0.667, 0.7, 0.545, 0.462, 0.711, 0.642, 0.744, 0.604, 0.612,
>
> 0.667, 0.533, 0.556, 0.444, 0.526, 0.323, 0.308, 0.195, 0.333,
>
> 0.323, 0.256, 0.345, 0.205, 0.286, 0.706, 0.7, 0.6, 0.571, 0.364,
>
> 0.429, 0.326, 0.571, 0.424, 0.341, 0.387, 0.341, 0.324, 0.696,
>
> 0.696, 0.583, 0.556, 0.645, 0.435, 0.471, 0.556), year2 = c(0.385,
>
> 0.552, 0.645, 0.516, 0.629, 0.595, 0.72, 0.638, 0.557, 0.588,
>
> 0.63, 0.744, 0.773, 0.571, 0.723, 0.769, 0.667, 0.667, 0.526,
>
> 0.476, 0.294, 0.323, 0.222, 0.556, 0.263, 0.37, 0.357, 0.25,
>
> 0.323, 0.778, 0.667, 0.636, 0.583, 0.432, 0.412, 0.333, 0.571,
>
> 0.39, 0.4, 0.452, 0.326, 0.471, 0.7, 0.75, 0.615, 0.462, 0.556,
>
> 0.4, 0.696, 0.465), year3 = c(0.435, 0.759, 0.759, 0.759, 0.714,
>
> 0.593, 0.651, 0.683, 0.513, 0.643, 0.652, 0.757, 0.791, 0.649,
>
> 0.78, 0.5, 0.5, 0.5, 0.533, 0.429, 0.333, 0.286, 0.231, 0.533,
>
> 0.303, 0.417, 0.333, 0.333, 0.357, 0.909, 1, 0.952, 0.8, 0.556,
>
> 0.529, 0.562, 0.762, 0.513, 0.733, 0.611, 0.733, 0.647, 0.909,
>
> 0.857, 0.8, 0.556, 0.588, 0.562, 0.857, 0.513), year4 = c(0.333,
>
> 0.533, 0.6, 0.483, 0.743, 0.5, 0.691, 0.619, 0.583, 0.385, 0.653,
>
> 0.762, 0.844, 0.64, 0.667, 0.571, 0.571, 0.615, 0.421, 0.5, 0.205,
>
> 0.308, 0.25, 0.6, 0.242, 0.308, 0.276, 0.235, 0.211, 0.9, 0.632,
>
> 0.72, 0.727, 0.356, 0.5, 0.368, 0.5, 0.41, 0.562, 0.514, 0.4,
>
> 0.409, 0.632, 0.72, 0.727, 0.4, 0.5, 0.421, 0.5, 0.462)), .Names = c("year1",
>
> "year2", "year3", "year4"), row.names = c(NA, -50L), class = "data.frame")
>
> head(daT)
>
> # null hypothesis; difference is equal to zero
>
> dif1.2<-daT$year2-daT$year1
>
> k=10000
>
> mysamples1.2=replicate(k, sample(dif1.2, replace=T))
>
> mymeans1.2=apply(mysamples1.2, 2, mean)
>
> quantile(mymeans1.2, c(0.025, 0.975))
>
> hist(mysamples1.2)
>
> mean(mymeans1.2)
>
> #what is p value?
>
>
> #similarly Now I want to compare between year 1 and year3,
>
> dif1.3<-daT$year3-daT$year1
>
> mysamples1.3=replicate(k, sample(dif1.3, replace=T))
>
> mymeans1.3=apply(mysamples1.3, 2, mean)
>
> quantile(mymeans1.3, c(0.025, 0.975))
>
>
>         [[alternative HTML version deleted]]
>
> ______________________________________________
> [hidden email] mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: loop for comparing two or more groups using bootstrapping

Kristi Glover
Dear Jim,

Thank you very much for the code. I run it but it gave me row names like "year224", "year142".

are these the difference between columns? If we want to get bootstrapping means of difference between years (year2-year1; year3-year1), its CI and exact p value, how can we get it?

thanks

KG

----

head(daT)

colname.mat<-combn(paste0("year",1:4),2)

samplenames<-apply(colname.mat,2,paste,collapse="")

k<-10

for(column in 1:ncol(colname.mat)) {

 assign(samplenames[column],replicate(k,sample(unlist(daT[,colname.mat[,column]]),3,TRUE)))

}


> get(samplenames[1])
         [,1]  [,2]  [,3]  [,4]  [,5]  [,6]  [,7]  [,8]  [,9] [,10]
year224 0.556 0.667 0.571 0.526 0.629 0.696 0.323 0.526 0.256 0.667
year142 0.324 0.324 0.706 0.638 0.600 0.294 0.612 0.688 0.432 0.387
year237 0.571 0.696 0.629 0.471 0.462 0.471 0.452 0.595 0.333 0.435




________________________________
From: Jim Lemon <[hidden email]>
Sent: September 11, 2018 1:44 AM
To: Kristi Glover
Cc: r-help mailing list
Subject: Re: [R] loop for comparing two or more groups using bootstrapping

Hi Kristy,
Try this:

colname.mat<-combn(paste0("year",1:4),2)
samplenames<-apply(colname.mat,2,paste,collapse="")
k<-10000
for(column in 1:ncol(colname.mat)) {
 assign(samplenames[column],replicate(k,sample(unlist(daT[,colname.mat[,column]]),3,TRUE)))
}

Then use get(samplenames[1]) and so on to access the values.

Jim
On Tue, Sep 11, 2018 at 4:52 PM Kristi Glover <[hidden email]> wrote:

>
> Hi R users,
>
> I was trying to test a null hypothesis of difference between two groups was 0. I have many years data, such as year1, year2,, year3, year4 and I was trying to compare between year1 and year2, year1 and year3, year2 and year3 and so on and have used following code with an example data.
>
>
> I tried to make a loop but did not work to compare between many years, and also want to obtain the exact p value. Would you mind to help me to make a loop?
>
> Thanks for your help.
>
>
> KG
>
>
> daT<-structure(list(year1 = c(0.417, 0.538, 0.69, 0.688, 0.688, 0.606,
> 0.667, 0.7, 0.545, 0.462, 0.711, 0.642, 0.744, 0.604, 0.612,
> 0.667, 0.533, 0.556, 0.444, 0.526, 0.323, 0.308, 0.195, 0.333,
> 0.323, 0.256, 0.345, 0.205, 0.286, 0.706, 0.7, 0.6, 0.571, 0.364,
> 0.429, 0.326, 0.571, 0.424, 0.341, 0.387, 0.341, 0.324, 0.696,
> 0.696, 0.583, 0.556, 0.645, 0.435, 0.471, 0.556), year2 = c(0.385,
> 0.552, 0.645, 0.516, 0.629, 0.595, 0.72, 0.638, 0.557, 0.588,
> 0.63, 0.744, 0.773, 0.571, 0.723, 0.769, 0.667, 0.667, 0.526,
> 0.476, 0.294, 0.323, 0.222, 0.556, 0.263, 0.37, 0.357, 0.25,
> 0.323, 0.778, 0.667, 0.636, 0.583, 0.432, 0.412, 0.333, 0.571,
> 0.39, 0.4, 0.452, 0.326, 0.471, 0.7, 0.75, 0.615, 0.462, 0.556,
> 0.4, 0.696, 0.465), year3 = c(0.435, 0.759, 0.759, 0.759, 0.714,
> 0.593, 0.651, 0.683, 0.513, 0.643, 0.652, 0.757, 0.791, 0.649,
> 0.78, 0.5, 0.5, 0.5, 0.533, 0.429, 0.333, 0.286, 0.231, 0.533,
> 0.303, 0.417, 0.333, 0.333, 0.357, 0.909, 1, 0.952, 0.8, 0.556,
> 0.529, 0.562, 0.762, 0.513, 0.733, 0.611, 0.733, 0.647, 0.909,
> 0.857, 0.8, 0.556, 0.588, 0.562, 0.857, 0.513), year4 = c(0.333,
> 0.533, 0.6, 0.483, 0.743, 0.5, 0.691, 0.619, 0.583, 0.385, 0.653,
> 0.762, 0.844, 0.64, 0.667, 0.571, 0.571, 0.615, 0.421, 0.5, 0.205,
> 0.308, 0.25, 0.6, 0.242, 0.308, 0.276, 0.235, 0.211, 0.9, 0.632,
> 0.72, 0.727, 0.356, 0.5, 0.368, 0.5, 0.41, 0.562, 0.514, 0.4,
> 0.409, 0.632, 0.72, 0.727, 0.4, 0.5, 0.421, 0.5, 0.462)), .Names = c("year1",
> "year2", "year3", "year4"), row.names = c(NA, -50L), class = "data.frame")
>
> head(daT)
>
> # null hypothesis; difference is equal to zero
>
> dif1.2<-daT$year2-daT$year1
>
> k=10000
>
> mysamples1.2=replicate(k, sample(dif1.2, replace=T))
>
> mymeans1.2=apply(mysamples1.2, 2, mean)
>
> quantile(mymeans1.2, c(0.025, 0.975))
>
> hist(mysamples1.2)
>
> mean(mymeans1.2)
>
> #what is p value?
>
>
> #similarly Now I want to compare between year 1 and year3,
>
> dif1.3<-daT$year3-daT$year1
>
> mysamples1.3=replicate(k, sample(dif1.3, replace=T))
>
> mymeans1.3=apply(mysamples1.3, 2, mean)
>
> quantile(mymeans1.3, c(0.025, 0.975))
>
>
>         [[alternative HTML version deleted]]
>
> ______________________________________________
> [hidden email] mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help

thz.ch/mailman/listinfo/r-help>
stat.ethz.ch
The main R mailing list, for announcements about the development of R and the availability of new code, questions and answers about problems and solutions using R, enhancements and patches to the source code and documentation of R, comparison and compatibility with S and S-plus, and for the posting of nice examples and benchmarks.



> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

        [[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: loop for comparing two or more groups using bootstrapping

Jim Lemon-4
Hi Kristi,
My fault, I only worked out how to assign the values to the names and pick
out the columns of daT for the calculations. I think this does what you
want, but I can't guarantee the result.

daT<-structure(list(year1=c(0.417,0.538,0.69,0.688,0.688,0.606,
0.667,0.7,0.545,0.462,0.711,0.642,0.744,0.604,0.612,
0.667,0.533,0.556,0.444,0.526,0.323,0.308,0.195,0.333,
0.323,0.256,0.345,0.205,0.286,0.706,0.7,0.6,0.571,0.364,
0.429,0.326,0.571,0.424,0.341,0.387,0.341,0.324,0.696,
0.696,0.583,0.556,0.645,0.435,0.471,0.556),year2=c(0.385,
0.552,0.645,0.516,0.629,0.595,0.72,0.638,0.557,0.588,
0.63,0.744,0.773,0.571,0.723,0.769,0.667,0.667,0.526,
0.476,0.294,0.323,0.222,0.556,0.263,0.37,0.357,0.25,
0.323,0.778,0.667,0.636,0.583,0.432,0.412,0.333,0.571,
0.39,0.4,0.452,0.326,0.471,0.7,0.75,0.615,0.462,0.556,
0.4,0.696,0.465),year3=c(0.435,0.759,0.759,0.759,0.714,
0.593,0.651,0.683,0.513,0.643,0.652,0.757,0.791,0.649,
0.78,0.5,0.5,0.5,0.533,0.429,0.333,0.286,0.231,0.533,
0.303,0.417,0.333,0.333,0.357,0.909,1,0.952,0.8,0.556,
0.529,0.562,0.762,0.513,0.733,0.611,0.733,0.647,0.909,
0.857,0.8,0.556,0.588,0.562,0.857,0.513),year4=c(0.333,
0.533,0.6,0.483,0.743,0.5,0.691,0.619,0.583,0.385,0.653,
0.762,0.844,0.64,0.667,0.571,0.571,0.615,0.421,0.5,0.205,
0.308,0.25,0.6,0.242,0.308,0.276,0.235,0.211,0.9,0.632,
0.72,0.727,0.356,0.5,0.368,0.5,0.41,0.562,0.514,0.4,
0.409,0.632,0.72,0.727,0.4,0.5,0.421,0.5,0.462)),.Names=c("year1",
"year2","year3","year4"),row.names=c(NA,-50L),class="data.frame")
colname.mat<-combn(paste0("year",1:4),2)
samplenames<-apply(colname.mat,2,paste,collapse="")
k<-10000
meandiff<-function(x) return(mean(x[[1]])-mean(x[[2]]))
for(column in 1:ncol(colname.mat)) {
 assign(samplenames[column],
  replicate(k,data.frame(sample(daT[,colname.mat[1,column]],3,TRUE),
   sample(daT[,colname.mat[2,column]],3,TRUE))))
 meandiffs<-unlist(apply(get(samplenames[column]),2,meandiff))
 cat(samplenames[column],"\n")
 cat("mean diff =",mean(meandiffs),"95% CI =",
  quantile(meandiffs,c(0.025,0.975)),"\n")
 png(paste0(samplenames[column],".png")
 hist(meandiffs)
 dev.off()
}

You should get a printout of the means and CIs and  bunch of PNG files with
the histograms.

Jim


On Tue, Sep 11, 2018 at 11:55 PM Kristi Glover <[hidden email]>
wrote:

> Dear Jim,
>
> Thank you very much for the code. I run it but it gave me row names
> like "year224", "year142".
>
> are these the difference between columns? If we want to get bootstrapping
> means of difference between years (year2-year1; year3-year1), its CI and
> exact p value, how can we get it?
>
> thanks
>
> KG
>
> ----
>
> head(daT)
>
> colname.mat<-combn(paste0("year",1:4),2)
>
> samplenames<-apply(colname.mat,2,paste,collapse="")
>
> k<-10
>
> for(column in 1:ncol(colname.mat)) {
>
>  assign(samplenames[column],replicate(k,sample(unlist(daT[,colname.mat[,
> column]]),3,TRUE)))
>
> }
>
> > get(samplenames[1])
>          [,1]  [,2]  [,3]  [,4]  [,5]  [,6]  [,7]  [,8]  [,9] [,10]
> year224 0.556 0.667 0.571 0.526 0.629 0.696 0.323 0.526 0.256 0.667
> year142 0.324 0.324 0.706 0.638 0.600 0.294 0.612 0.688 0.432 0.387
> year237 0.571 0.696 0.629 0.471 0.462 0.471 0.452 0.595 0.333 0.435
>
>
>
>
> ------------------------------
> *From:* Jim Lemon <[hidden email]>
> *Sent:* September 11, 2018 1:44 AM
> *To:* Kristi Glover
> *Cc:* r-help mailing list
> *Subject:* Re: [R] loop for comparing two or more groups using
> bootstrapping
>
> Hi Kristy,
> Try this:
>
> colname.mat<-combn(paste0("year",1:4),2)
> samplenames<-apply(colname.mat,2,paste,collapse="")
> k<-10000
> for(column in 1:ncol(colname.mat)) {
>
>  assign(samplenames[column],replicate(k,sample(unlist(daT[,colname.mat[,column]]),3,TRUE)))
> }
>
> Then use get(samplenames[1]) and so on to access the values.
>
> Jim
> On Tue, Sep 11, 2018 at 4:52 PM Kristi Glover <[hidden email]>
> wrote:
> >
> > Hi R users,
> >
> > I was trying to test a null hypothesis of difference between two groups
> was 0. I have many years data, such as year1, year2,, year3, year4 and I
> was trying to compare between year1 and year2, year1 and year3, year2 and
> year3 and so on and have used following code with an example data.
> >
> >
> > I tried to make a loop but did not work to compare between many years,
> and also want to obtain the exact p value. Would you mind to help me to
> make a loop?
> >
> > Thanks for your help.
> >
> >
> > KG
> >
> >
> > daT<-structure(list(year1 = c(0.417, 0.538, 0.69, 0.688, 0.688, 0.606,
> > 0.667, 0.7, 0.545, 0.462, 0.711, 0.642, 0.744, 0.604, 0.612,
> > 0.667, 0.533, 0.556, 0.444, 0.526, 0.323, 0.308, 0.195, 0.333,
> > 0.323, 0.256, 0.345, 0.205, 0.286, 0.706, 0.7, 0.6, 0.571, 0.364,
> > 0.429, 0.326, 0.571, 0.424, 0.341, 0.387, 0.341, 0.324, 0.696,
> > 0.696, 0.583, 0.556, 0.645, 0.435, 0.471, 0.556), year2 = c(0.385,
> > 0.552, 0.645, 0.516, 0.629, 0.595, 0.72, 0.638, 0.557, 0.588,
> > 0.63, 0.744, 0.773, 0.571, 0.723, 0.769, 0.667, 0.667, 0.526,
> > 0.476, 0.294, 0.323, 0.222, 0.556, 0.263, 0.37, 0.357, 0.25,
> > 0.323, 0.778, 0.667, 0.636, 0.583, 0.432, 0.412, 0.333, 0.571,
> > 0.39, 0.4, 0.452, 0.326, 0.471, 0.7, 0.75, 0.615, 0.462, 0.556,
> > 0.4, 0.696, 0.465), year3 = c(0.435, 0.759, 0.759, 0.759, 0.714,
> > 0.593, 0.651, 0.683, 0.513, 0.643, 0.652, 0.757, 0.791, 0.649,
> > 0.78, 0.5, 0.5, 0.5, 0.533, 0.429, 0.333, 0.286, 0.231, 0.533,
> > 0.303, 0.417, 0.333, 0.333, 0.357, 0.909, 1, 0.952, 0.8, 0.556,
> > 0.529, 0.562, 0.762, 0.513, 0.733, 0.611, 0.733, 0.647, 0.909,
> > 0.857, 0.8, 0.556, 0.588, 0.562, 0.857, 0.513), year4 = c(0.333,
> > 0.533, 0.6, 0.483, 0.743, 0.5, 0.691, 0.619, 0.583, 0.385, 0.653,
> > 0.762, 0.844, 0.64, 0.667, 0.571, 0.571, 0.615, 0.421, 0.5, 0.205,
> > 0.308, 0.25, 0.6, 0.242, 0.308, 0.276, 0.235, 0.211, 0.9, 0.632,
> > 0.72, 0.727, 0.356, 0.5, 0.368, 0.5, 0.41, 0.562, 0.514, 0.4,
> > 0.409, 0.632, 0.72, 0.727, 0.4, 0.5, 0.421, 0.5, 0.462)), .Names =
> c("year1",
> > "year2", "year3", "year4"), row.names = c(NA, -50L), class =
> "data.frame")
> >
> > head(daT)
> >
> > # null hypothesis; difference is equal to zero
> >
> > dif1.2<-daT$year2-daT$year1
> >
> > k=10000
> >
> > mysamples1.2=replicate(k, sample(dif1.2, replace=T))
> >
> > mymeans1.2=apply(mysamples1.2, 2, mean)
> >
> > quantile(mymeans1.2, c(0.025, 0.975))
> >
> > hist(mysamples1.2)
> >
> > mean(mymeans1.2)
> >
> > #what is p value?
> >
> >
> > #similarly Now I want to compare between year 1 and year3,
> >
> > dif1.3<-daT$year3-daT$year1
> >
> > mysamples1.3=replicate(k, sample(dif1.3, replace=T))
> >
> > mymeans1.3=apply(mysamples1.3, 2, mean)
> >
> > quantile(mymeans1.3, c(0.025, 0.975))
> >
> >
> >         [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > [hidden email] mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> R-help -- Main R Mailing List: Primary help - Homepage - SfS
> <https://stat.ethz.ch/mailman/listinfo/r-help>
> stat.ethz.ch
> The main R mailing list, for announcements about the development of R and
> the availability of new code, questions and answers about problems and
> solutions using R, enhancements and patches to the source code and
> documentation of R, comparison and compatibility with S and S-plus, and for
> the posting of nice examples and benchmarks.
>
>
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>

        [[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: loop for comparing two or more groups using bootstrapping

Marna Wagley
Thank you Jim, it worked. I am very grateful for your help.
Thanks
KG

On Tue, Sep 11, 2018 at 3:51 PM Jim Lemon <[hidden email]> wrote:

> Hi Kristi,
> My fault, I only worked out how to assign the values to the names and pick
> out the columns of daT for the calculations. I think this does what you
> want, but I can't guarantee the result.
>
> daT<-structure(list(year1=c(0.417,0.538,0.69,0.688,0.688,0.606,
> 0.667,0.7,0.545,0.462,0.711,0.642,0.744,0.604,0.612,
> 0.667,0.533,0.556,0.444,0.526,0.323,0.308,0.195,0.333,
> 0.323,0.256,0.345,0.205,0.286,0.706,0.7,0.6,0.571,0.364,
> 0.429,0.326,0.571,0.424,0.341,0.387,0.341,0.324,0.696,
> 0.696,0.583,0.556,0.645,0.435,0.471,0.556),year2=c(0.385,
> 0.552,0.645,0.516,0.629,0.595,0.72,0.638,0.557,0.588,
> 0.63,0.744,0.773,0.571,0.723,0.769,0.667,0.667,0.526,
> 0.476,0.294,0.323,0.222,0.556,0.263,0.37,0.357,0.25,
> 0.323,0.778,0.667,0.636,0.583,0.432,0.412,0.333,0.571,
> 0.39,0.4,0.452,0.326,0.471,0.7,0.75,0.615,0.462,0.556,
> 0.4,0.696,0.465),year3=c(0.435,0.759,0.759,0.759,0.714,
> 0.593,0.651,0.683,0.513,0.643,0.652,0.757,0.791,0.649,
> 0.78,0.5,0.5,0.5,0.533,0.429,0.333,0.286,0.231,0.533,
> 0.303,0.417,0.333,0.333,0.357,0.909,1,0.952,0.8,0.556,
> 0.529,0.562,0.762,0.513,0.733,0.611,0.733,0.647,0.909,
> 0.857,0.8,0.556,0.588,0.562,0.857,0.513),year4=c(0.333,
> 0.533,0.6,0.483,0.743,0.5,0.691,0.619,0.583,0.385,0.653,
> 0.762,0.844,0.64,0.667,0.571,0.571,0.615,0.421,0.5,0.205,
> 0.308,0.25,0.6,0.242,0.308,0.276,0.235,0.211,0.9,0.632,
> 0.72,0.727,0.356,0.5,0.368,0.5,0.41,0.562,0.514,0.4,
> 0.409,0.632,0.72,0.727,0.4,0.5,0.421,0.5,0.462)),.Names=c("year1",
> "year2","year3","year4"),row.names=c(NA,-50L),class="data.frame")
> colname.mat<-combn(paste0("year",1:4),2)
> samplenames<-apply(colname.mat,2,paste,collapse="")
> k<-10000
> meandiff<-function(x) return(mean(x[[1]])-mean(x[[2]]))
> for(column in 1:ncol(colname.mat)) {
>  assign(samplenames[column],
>   replicate(k,data.frame(sample(daT[,colname.mat[1,column]],3,TRUE),
>    sample(daT[,colname.mat[2,column]],3,TRUE))))
>  meandiffs<-unlist(apply(get(samplenames[column]),2,meandiff))
>  cat(samplenames[column],"\n")
>  cat("mean diff =",mean(meandiffs),"95% CI =",
>   quantile(meandiffs,c(0.025,0.975)),"\n")
>  png(paste0(samplenames[column],".png")
>  hist(meandiffs)
>  dev.off()
> }
>
> You should get a printout of the means and CIs and  bunch of PNG files with
> the histograms.
>
> Jim
>
>
> On Tue, Sep 11, 2018 at 11:55 PM Kristi Glover <[hidden email]>
> wrote:
>
> > Dear Jim,
> >
> > Thank you very much for the code. I run it but it gave me row names
> > like "year224", "year142".
> >
> > are these the difference between columns? If we want to get bootstrapping
> > means of difference between years (year2-year1; year3-year1), its CI and
> > exact p value, how can we get it?
> >
> > thanks
> >
> > KG
> >
> > ----
> >
> > head(daT)
> >
> > colname.mat<-combn(paste0("year",1:4),2)
> >
> > samplenames<-apply(colname.mat,2,paste,collapse="")
> >
> > k<-10
> >
> > for(column in 1:ncol(colname.mat)) {
> >
> >  assign(samplenames[column],replicate(k,sample(unlist(daT[,colname.mat[,
> > column]]),3,TRUE)))
> >
> > }
> >
> > > get(samplenames[1])
> >          [,1]  [,2]  [,3]  [,4]  [,5]  [,6]  [,7]  [,8]  [,9] [,10]
> > year224 0.556 0.667 0.571 0.526 0.629 0.696 0.323 0.526 0.256 0.667
> > year142 0.324 0.324 0.706 0.638 0.600 0.294 0.612 0.688 0.432 0.387
> > year237 0.571 0.696 0.629 0.471 0.462 0.471 0.452 0.595 0.333 0.435
> >
> >
> >
> >
> > ------------------------------
> > *From:* Jim Lemon <[hidden email]>
> > *Sent:* September 11, 2018 1:44 AM
> > *To:* Kristi Glover
> > *Cc:* r-help mailing list
> > *Subject:* Re: [R] loop for comparing two or more groups using
> > bootstrapping
> >
> > Hi Kristy,
> > Try this:
> >
> > colname.mat<-combn(paste0("year",1:4),2)
> > samplenames<-apply(colname.mat,2,paste,collapse="")
> > k<-10000
> > for(column in 1:ncol(colname.mat)) {
> >
> >
> assign(samplenames[column],replicate(k,sample(unlist(daT[,colname.mat[,column]]),3,TRUE)))
> > }
> >
> > Then use get(samplenames[1]) and so on to access the values.
> >
> > Jim
> > On Tue, Sep 11, 2018 at 4:52 PM Kristi Glover <[hidden email]
> >
> > wrote:
> > >
> > > Hi R users,
> > >
> > > I was trying to test a null hypothesis of difference between two groups
> > was 0. I have many years data, such as year1, year2,, year3, year4 and I
> > was trying to compare between year1 and year2, year1 and year3, year2 and
> > year3 and so on and have used following code with an example data.
> > >
> > >
> > > I tried to make a loop but did not work to compare between many years,
> > and also want to obtain the exact p value. Would you mind to help me to
> > make a loop?
> > >
> > > Thanks for your help.
> > >
> > >
> > > KG
> > >
> > >
> > > daT<-structure(list(year1 = c(0.417, 0.538, 0.69, 0.688, 0.688, 0.606,
> > > 0.667, 0.7, 0.545, 0.462, 0.711, 0.642, 0.744, 0.604, 0.612,
> > > 0.667, 0.533, 0.556, 0.444, 0.526, 0.323, 0.308, 0.195, 0.333,
> > > 0.323, 0.256, 0.345, 0.205, 0.286, 0.706, 0.7, 0.6, 0.571, 0.364,
> > > 0.429, 0.326, 0.571, 0.424, 0.341, 0.387, 0.341, 0.324, 0.696,
> > > 0.696, 0.583, 0.556, 0.645, 0.435, 0.471, 0.556), year2 = c(0.385,
> > > 0.552, 0.645, 0.516, 0.629, 0.595, 0.72, 0.638, 0.557, 0.588,
> > > 0.63, 0.744, 0.773, 0.571, 0.723, 0.769, 0.667, 0.667, 0.526,
> > > 0.476, 0.294, 0.323, 0.222, 0.556, 0.263, 0.37, 0.357, 0.25,
> > > 0.323, 0.778, 0.667, 0.636, 0.583, 0.432, 0.412, 0.333, 0.571,
> > > 0.39, 0.4, 0.452, 0.326, 0.471, 0.7, 0.75, 0.615, 0.462, 0.556,
> > > 0.4, 0.696, 0.465), year3 = c(0.435, 0.759, 0.759, 0.759, 0.714,
> > > 0.593, 0.651, 0.683, 0.513, 0.643, 0.652, 0.757, 0.791, 0.649,
> > > 0.78, 0.5, 0.5, 0.5, 0.533, 0.429, 0.333, 0.286, 0.231, 0.533,
> > > 0.303, 0.417, 0.333, 0.333, 0.357, 0.909, 1, 0.952, 0.8, 0.556,
> > > 0.529, 0.562, 0.762, 0.513, 0.733, 0.611, 0.733, 0.647, 0.909,
> > > 0.857, 0.8, 0.556, 0.588, 0.562, 0.857, 0.513), year4 = c(0.333,
> > > 0.533, 0.6, 0.483, 0.743, 0.5, 0.691, 0.619, 0.583, 0.385, 0.653,
> > > 0.762, 0.844, 0.64, 0.667, 0.571, 0.571, 0.615, 0.421, 0.5, 0.205,
> > > 0.308, 0.25, 0.6, 0.242, 0.308, 0.276, 0.235, 0.211, 0.9, 0.632,
> > > 0.72, 0.727, 0.356, 0.5, 0.368, 0.5, 0.41, 0.562, 0.514, 0.4,
> > > 0.409, 0.632, 0.72, 0.727, 0.4, 0.5, 0.421, 0.5, 0.462)), .Names =
> > c("year1",
> > > "year2", "year3", "year4"), row.names = c(NA, -50L), class =
> > "data.frame")
> > >
> > > head(daT)
> > >
> > > # null hypothesis; difference is equal to zero
> > >
> > > dif1.2<-daT$year2-daT$year1
> > >
> > > k=10000
> > >
> > > mysamples1.2=replicate(k, sample(dif1.2, replace=T))
> > >
> > > mymeans1.2=apply(mysamples1.2, 2, mean)
> > >
> > > quantile(mymeans1.2, c(0.025, 0.975))
> > >
> > > hist(mysamples1.2)
> > >
> > > mean(mymeans1.2)
> > >
> > > #what is p value?
> > >
> > >
> > > #similarly Now I want to compare between year 1 and year3,
> > >
> > > dif1.3<-daT$year3-daT$year1
> > >
> > > mysamples1.3=replicate(k, sample(dif1.3, replace=T))
> > >
> > > mymeans1.3=apply(mysamples1.3, 2, mean)
> > >
> > > quantile(mymeans1.3, c(0.025, 0.975))
> > >
> > >
> > >         [[alternative HTML version deleted]]
> > >
> > > ______________________________________________
> > > [hidden email] mailing list -- To UNSUBSCRIBE and more, see
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > R-help -- Main R Mailing List: Primary help - Homepage - SfS
> > <https://stat.ethz.ch/mailman/listinfo/r-help>
> > stat.ethz.ch
> > The main R mailing list, for announcements about the development of R and
> > the availability of new code, questions and answers about problems and
> > solutions using R, enhancements and patches to the source code and
> > documentation of R, comparison and compatibility with S and S-plus, and
> for
> > the posting of nice examples and benchmarks.
> >
> >
> > > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
> >
>
>         [[alternative HTML version deleted]]
>
> ______________________________________________
> [hidden email] mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

        [[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.