random sampling with some limitive conditions?


Zhang Jian
I want to generate thousands of random data sets by randomizing
presence-absence data. One important restriction is that the row and
column sums must stay fixed. For example, the data "tst" are as follows:
   site1 site2 site3 site4 site5 site6 site7 site8
       1     0     0     0     1     1     0     0
       0     1     1     1     0     1     0     1
       1     0     0     0     1     0     1     0
       0     0     0     1     0     1     0     1
       1     0     1     0     0     0     0     0
       0     1     0     1     1     1     1     1
       1     0     0     0     0     0     0     0
       0     0     0     1     0     1     0     1

sum(tst[1,]) = 3, sum(tst[,1]) = 4, and so on. When I randomize the data, the
first row sum must still equal 3, and the first column sum must still equal 4.
The same rule applies to every row and column.
How can I generate such randomized data? I have no idea.
Thanks.
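
For readers following along, the table in the post can be entered directly in R and the two quoted sums checked. This is only a sketch: the object name "tst" comes from the post, while the 8 x 8 layout (eight values per row, no row names) is an assumption inferred from the eight site columns and the two sums quoted.

```r
# Presence-absence table from the post, entered row by row.
# Assumed layout: 8 rows x 8 site columns, no row names.
tst <- matrix(c(
  1, 0, 0, 0, 1, 1, 0, 0,
  0, 1, 1, 1, 0, 1, 0, 1,
  1, 0, 0, 0, 1, 0, 1, 0,
  0, 0, 0, 1, 0, 1, 0, 1,
  1, 0, 1, 0, 0, 0, 0, 0,
  0, 1, 0, 1, 1, 1, 1, 1,
  1, 0, 0, 0, 0, 0, 0, 0,
  0, 0, 0, 1, 0, 1, 0, 1
), nrow = 8, byrow = TRUE,
   dimnames = list(NULL, paste0("site", 1:8)))

# Margins that every randomized table has to reproduce
row_totals <- rowSums(tst)
col_totals <- colSums(tst)

sum(tst[1, ])   # 3, as stated in the post
sum(tst[, 1])   # 4, as stated in the post
```

Any randomization that satisfies the requirement must return a 0/1 table whose rowSums() and colSums() equal row_totals and col_totals.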


______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: random sampling with some limitive conditions?

Daniel Nordlund
> -----Original Message-----
> From: [hidden email] [mailto:[hidden email]]
> On Behalf Of Zhang Jian
> Sent: Saturday, July 07, 2007 12:31 PM
> To: r-help
> Subject: [R] random sampling with some limitive conditions?
>

You could reorder your table by stepping through it one column at a time and, for each column, randomly deciding whether to swap it with another column that has the same column total.  Then repeat the process for the rows, i.e. for each row, randomly choose a row with the same row total to swap with.

Here is some example code which is neither efficient nor general, but does demonstrate the basic idea.  You will need to decide whether this approach meets your needs.

# I created a data file with your table (8x8) and read from it
sites <- read.table("c:/R/R-examples/site_random_sample.txt", header=TRUE)
sites
# get row and column totals
colsums <- colSums(sites)
rowsums <- rowSums(sites)
# randomly swap columns that share the same column total
for (i in seq_len(ncol(sites))) {
  if (runif(1) > .5) {
    # pick by index: sample(x, 1) on a length-one x would draw from 1:x
    candidates <- which(colsums == colsums[i])
    swapcol <- candidates[sample.int(length(candidates), 1)]
    temp <- sites[, swapcol]
    sites[, swapcol] <- sites[, i]
    sites[, i] <- temp
  }
}
# randomly swap rows that share the same row total
for (i in seq_len(nrow(sites))) {
  if (runif(1) > .5) {
    candidates <- which(rowsums == rowsums[i])
    swaprow <- candidates[sample.int(length(candidates), 1)]
    temp <- sites[swaprow, ]
    sites[swaprow, ] <- sites[i, ]
    sites[i, ] <- temp
  }
}
sites
   

Hope this is helpful,

Dan

Daniel Nordlund
Bothell, WA USA


Re: random sampling with some limitive conditions?

Zhang Jian
The method does produce a new data set, but I do not think it is really
random. I used the new "random" data to compute the index I am interested
in, and got the same value as with the original data "sites".
I tried it again and again, and the result was always the same.
So I think I need to find a different random sampling method.
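
One possible reason for the unchanged index: exchanging whole rows or whole columns only reorders the table, so any index that does not depend on row or column order will come out the same. A different technique, not the one posted earlier in this thread, is the "checkerboard swap" often used for presence-absence tables: pick two rows and two columns at random, and if the 2 x 2 sub-table has ones on one diagonal and zeros on the other, flip it. Each flip leaves every row and column total unchanged while changing which cells are occupied. The sketch below is an illustration under assumptions: the function name swap_randomize and its n_attempts argument are hypothetical, the table is assumed to be the 8 x 8 layout of the example data, the number of attempts is arbitrary, and no claim is made that the resulting tables are sampled uniformly.

```r
# Hypothetical helper: randomize a 0/1 matrix by repeated checkerboard swaps.
# Every swap keeps all row totals and all column totals unchanged.
swap_randomize <- function(m, n_attempts = 1000) {
  m <- as.matrix(m)
  for (k in seq_len(n_attempts)) {
    rows <- sample.int(nrow(m), 2)   # two distinct rows
    cols <- sample.int(ncol(m), 2)   # two distinct columns
    sub <- m[rows, cols]
    # A checkerboard is (1 0 / 0 1) or (0 1 / 1 0)
    is_checkerboard <- sub[1, 1] == sub[2, 2] &&
                       sub[1, 2] == sub[2, 1] &&
                       sub[1, 1] != sub[1, 2]
    if (is_checkerboard) {
      m[rows, cols] <- 1 - sub       # flip it; margins are unaffected
    }
  }
  m
}

# Example data from the first post (assumed 8 x 8 layout)
tst <- matrix(c(
  1, 0, 0, 0, 1, 1, 0, 0,
  0, 1, 1, 1, 0, 1, 0, 1,
  1, 0, 0, 0, 1, 0, 1, 0,
  0, 0, 0, 1, 0, 1, 0, 1,
  1, 0, 1, 0, 0, 0, 0, 0,
  0, 1, 0, 1, 1, 1, 1, 1,
  1, 0, 0, 0, 0, 0, 0, 0,
  0, 0, 0, 1, 0, 1, 0, 1
), nrow = 8, byrow = TRUE)

set.seed(1)                          # only to make the run repeatable
new_tst <- swap_randomize(tst, n_attempts = 1000)

# Both checks are TRUE by construction
all(rowSums(new_tst) == rowSums(tst))
all(colSums(new_tst) == colSums(tst))
```

To obtain thousands of randomized tables, the function could be called repeatedly, for example with replicate(1000, swap_randomize(tst), simplify = FALSE), computing the index of interest on each result.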



On 7/7/07, Daniel Nordlund <[hidden email]> wrote:


Re: random sampling with some limitive conditions?

Zhang Jian
Does anyone have a method or any advice for this kind of random sampling?
I have no idea how to proceed.
Thanks a lot.


On 7/8/07, Zhang Jian <[hidden email]> wrote:
