Generating unordered, with replacement, samples

Giovanni Petris

Hello,

In my teaching, I am trying to connect some elementary probability with Monte Carlo ideas. In sampling from a finite population, the number of distinct samples of size 'k' from a population of size 'n', when individuals are selected with replacement and the selection order does not matter, is choose(n + k - 1, k). Does anyone have a suggestion for how to simulate (uniformly!) one of these possible samples? In a Monte Carlo framework I would like to do it repeatedly, so efficiency is of some relevance.
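
For small cases the count can be checked by brute force. What follows is a minimal sketch, assuming illustrative values n = 4 and k = 3 that are not part of the original question: it enumerates every ordered sample, sorts each one to discard the order, and counts the distinct results. The variable names are placeholders.

```r
n <- 4; k <- 3                      # illustrative values, kept small on purpose

# All n^k ordered samples with replacement, one per row
ordered <- as.matrix(expand.grid(rep(list(seq_len(n)), k)))

# Sorting each row discards the order; unique() keeps one row per unordered sample
unordered <- unique(t(apply(ordered, 1, sort)))

n_multisets <- nrow(unordered)
n_multisets == choose(n + k - 1, k)
```

This only scales to tiny n and k, since it builds all n^k ordered samples, but it is enough to confirm the formula in a classroom setting.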

Thank you in advance!

Best,
Giovanni



Giovanni Petris
Associate Professor
Department of Mathematical Sciences
University of Arkansas - Fayetteville, AR 72701
Ph: (479) 575-6324, 575-8630 (fax)
http://definetti.uark.edu/~gpetris/


______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: Generating unordered, with replacement, samples

Rui Barradas
Hello,

Try the sample() function (see ?sample). For example, if 'x' is a vector of length n,

sample(x, k, replace = TRUE)

If you want indices into 'x', try instead

sample(n, k, replace = TRUE)

Hope this helps,

Rui Barradas

Re: Generating unordered, with replacement, samples

Giovanni Petris

Thank you, Rui, but I have the feeling that in this way you are drawing uniformly *ordered* samples, and you get a bias if you consider them to be unordered.
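
The bias can be made concrete with an exact calculation. This is a minimal sketch, assuming an illustrative case of n = 2 and k = 2: it lists the equally likely ordered samples that sample(n, k, replace = TRUE) can return and adds up the probability of each unordered sample once the order is dropped.

```r
n <- 2; k <- 2                      # illustrative values

# The n^k ordered samples are equally likely under sample(n, k, replace = TRUE)
ordered <- as.matrix(expand.grid(rep(list(seq_len(n)), k)))

# Label each ordered sample by its sorted version, i.e. by the unordered sample it yields
label <- apply(ordered, 1, function(s) paste(sort(s), collapse = ","))

# Probability of each unordered sample when the order is simply ignored
prob <- table(label) / nrow(ordered)
prob
```

The mixed sample {1, 2} gets probability 1/2, while {1, 1} and {2, 2} get 1/4 each. A uniform draw over the three unordered samples would give each of them 1/3.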

Best,
Giovanni

Re: Generating unordered, with replacement, samples

Duncan Murdoch-2
In reply to this post by Giovanni Petris
On 17/09/2014 2:25 PM, Giovanni Petris wrote:
> Hello,
>
> I am trying to interface in my teaching some elementary probability with Monte Carlo ideas. In sampling from a finite population, the number of distinct samples of size 'k' from a population of size 'n' , when individuals are selected with replacement and the selection order does not matter, is choose(n + k -1, k). Does anyone have a suggestion about how to simulate (uniformly!) one of these possible samples? In a Monte Carlo framework I would like to do it repeatedly, so efficiency is of some relevance.
>
> Thank you in advance!

I forget the details of the derivation of that count, but the number
suggests it is found by selecting k things without replacement from
n+k-1.  The sample() function in R can easily give you a sample of k
integers from 1:(n+k-1); "all" you need to do is map those numbers into
your original sample of k from n.  For that you need to remember the
derivation of that formula!
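
One standard way to do this mapping, offered as a sketch and not necessarily the derivation referred to above: sort the k sampled integers and subtract 0, 1, ..., k-1 from them. The result is a nondecreasing sequence with values in 1:n, and the correspondence is one-to-one, so a uniform draw of k integers gives a uniform unordered sample. The function name rmultichoose is hypothetical.

```r
# Hypothetical helper: one uniform unordered sample of size k,
# drawn with replacement from 1:n, returned in nondecreasing order
rmultichoose <- function(n, k) {
  s <- sort(sample.int(n + k - 1, k))   # k distinct integers, increasing
  s - (seq_len(k) - 1)                  # nondecreasing values in 1:n
}

x <- rmultichoose(6, 4)
letters[x]                              # the sample expressed as population members
```

The cost per draw is one call to sample.int() and one sort of k values, which should be cheap enough for repeated Monte Carlo use.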

Duncan Murdoch

Re: Generating unordered, with replacement, samples

Duncan Murdoch-2
On 17/09/2014 3:07 PM, Duncan Murdoch wrote:

> I forget the details of the derivation of that count, but the number
> suggests it is found by selecting k things without replacement from
> n+k-1.  The sample() function in R can easily give you a sample of k
> integers from 1:(n+k-1); "all" you need to do is map those numbers into
> your original sample of k from n.  For that you need to remember the
> derivation of that formula!

The derivation is on this web page:
http://mathworld.wolfram.com/Multichoose.html .

Duncan Murdoch

Re: Generating unordered, with replacement, samples

Giovanni Petris
In reply to this post by Duncan Murdoch-2

Hi Duncan,

You are right. The idea of the derivation is to 'throw' k placeholders ("*" in the example below) into the list of the individuals of the population. For example, if the population is letters[1:6] and the sample size is 4, the following code generates a 'sample' uniformly.

> n <- 6; k <- 4
> set.seed(2)
> xxx <- rep("*", n + k)
> ind <- sort(sample(2 : (n+k), k))
> xxx[setdiff(1 : (n+k), ind)] <- letters[seq.int(n)]
> noquote(xxx)
 [1] a b * c d * * e f *

Each "*" stands for one draw of the nearest letter to its left, so this represents the sample (b, d, d, f). I am still missing the "all" I need to do that you mention, that is, how to transform the vector xxx into something more readily usable, like c("b", "d", "d", "f"), or even a summary of counts. I guess I am looking for a bit of R trickery here...
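
One possible way to decode such a vector, as a sketch: keep a running count of the letters seen so far with cumsum(); at each "*" that running count is the index of the letter the placeholder refers to. The vector below is typed in from the output shown above, so the example does not depend on the random number generator. The names is_star, idx, smpl and counts are illustrative.

```r
n <- 6
xxx <- c("a", "b", "*", "c", "d", "*", "*", "e", "f", "*")

is_star <- xxx == "*"
idx <- cumsum(!is_star)[is_star]     # index of the letter preceding each "*"

smpl <- letters[idx]                 # the sample itself
counts <- tabulate(idx, nbins = n)   # how many times each individual was drawn
names(counts) <- letters[seq_len(n)]

smpl
counts
```

Here smpl is b, d, d, f and counts is 0, 1, 0, 2, 0, 1 for a through f. Both steps are vectorized, so no explicit loop over positions is needed.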

Thank you,
Giovanni

Re: Generating unordered, with replacement, samples

Duncan Murdoch-2
On 17/09/2014 3:46 PM, Giovanni Petris wrote:

> Hi Duncan,
>
> You are right. The idea of the derivation consists in 'throwing' k placeholders ("*" in the example below) in the list of the individuals of the population. For example, if the population is letters[1:6], and the sample size is 4, the following code generates uniformly a 'sample'.
>
> > n <- 6; k <- 4
> > set.seed(2)
> > xxx <- rep("*", n + k)
> > ind <- sort(sample(2 : (n+k), k))
> > xxx[setdiff(1 : (n+k), ind)] <- letters[seq.int(n)]
> > noquote(xxx)
>   [1] a b * c d * * e f *
>
> This represents the sample (b, d, d, f). I am still missing the "all" I need to do that you mention, that is how I can transform the vector xxx into something more readily usable, like c(b, d, d, f), or even a summary of counts. I guess I am looking for a bit of R trickery here...

I think this works, but you'd better check!

Sample the placeholders (here, the n-1 dividers that separate the n individuals):

ind <- sort( sample(n + k - 1, n - 1) )  # sort() is needed: diff() below assumes increasing positions

Add placeholders at the start and end:

ind <- c(0, ind, n+k)

Take the diffs, and subtract one:

diff(ind) - 1

I think this gives the counts you want.
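
The steps above can be wrapped into a function and checked empirically. This is a sketch, with the function name rcounts and the test values n = 3, k = 2 chosen purely for illustration. In that case there are choose(4, 2) = 6 possible count vectors, so each should turn up in roughly 1/6 of the draws. Note that the sketch keeps the sort(), since diff() needs the positions in increasing order.

```r
# Hypothetical wrapper around the steps above:
# counts per individual for one uniform unordered sample of size k from n
rcounts <- function(n, k) {
  ind <- sort(sample.int(n + k - 1, n - 1))  # positions of the n-1 dividers
  diff(c(0, ind, n + k)) - 1                 # gaps between dividers = counts
}

set.seed(1)
counts <- rcounts(6, 4)
rep(letters[1:6], times = counts)            # expand the counts into the sample

# Empirical check of uniformity for n = 3, k = 2
sims <- replicate(60000, paste(rcounts(3, 2), collapse = ""))
freq <- table(sims) / length(sims)
freq
```

The counts always have length n and sum to k, and rep() turns them back into an explicit sample when that form is more convenient.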

Duncan Murdoch

Re: Generating unordered, with replacement, samples

Giovanni Petris

Thank you!

That does exactly what I was looking for.

Best,
Giovanni
