

Hello,
I am trying to connect some elementary probability with Monte Carlo ideas in my teaching. In sampling from a finite population, the number of distinct samples of size 'k' from a population of size 'n', when individuals are selected with replacement and the selection order does not matter, is choose(n + k - 1, k). Does anyone have a suggestion about how to simulate (uniformly!) one of these possible samples? In a Monte Carlo framework I would like to do it repeatedly, so efficiency is of some relevance.
Thank you in advance!
Best,
Giovanni
Giovanni Petris
Associate Professor
Department of Mathematical Sciences
University of Arkansas - Fayetteville, AR 72701
Ph: (479) 575-6324, 575-8630 (fax)
http://definetti.uark.edu/~gpetris/
______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Hello,
Try function ?sample. Something like, if 'x' is a vector of size n,
sample(x, k, replace = TRUE)
If you want indices into 'x', try instead
sample(n, k, replace = TRUE)
Hope this helps,
Rui Barradas


Thank you, Rui, but I have the feeling that in this way you are drawing uniformly *ordered* samples, and you get a bias if you consider them to be unordered.
Best,
Giovanni
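
The bias can be seen by exact enumeration in a small case. The sketch below (n = 2 and k = 2 are chosen purely for illustration, and the variable names are not from the thread) lists every ordered sample that sample(n, k, replace = TRUE) returns with equal probability, sorts each one to forget the order, and tabulates the resulting unordered samples.

```r
n <- 2; k <- 2
# All n^k equally likely ordered samples, one per row
ordered <- as.matrix(expand.grid(rep(list(seq_len(n)), k)))
# Forget the order: sort each row and paste it into a label
unordered <- apply(ordered, 1, function(s) paste(sort(s), collapse = ","))
# Probability of each unordered sample when ordered samples are drawn uniformly
probs <- table(unordered) / nrow(ordered)
probs
# Probability each unordered sample would have under a uniform draw
1 / choose(n + k - 1, k)
```

Here the unordered sample {1, 2} has probability 1/2, while {1, 1} and {2, 2} have probability 1/4 each; a uniform draw would give each of the three samples probability 1/3.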
________________________________________
From: Rui Barradas [ [hidden email]]
Sent: Wednesday, September 17, 2014 13:49
To: Giovanni Petris; [hidden email]
Subject: Re: [R] Generating unordered, with replacement, samples


I forget the details of the derivation of that count, but the number
suggests it is found by selecting k things without replacement from
n+k-1. The sample() function in R can easily give you a sample of k
integers from 1:(n+k-1); "all" you need to do is map those numbers into
your original sample of k from n. For that you need to remember the
derivation of that formula!
Duncan Murdoch


The derivation is on this web page:
http://mathworld.wolfram.com/Multichoose.html .
Duncan Murdoch
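
One way to carry out that mapping, sketched here from the standard stars-and-bars bijection rather than taken from the thread: draw k distinct integers from 1:(n + k - 1), sort them, and subtract 0, 1, ..., k - 1. The result is a nondecreasing sequence in 1:n, that is, an unordered sample with replacement. The values of n and k, the seed, and the names s and idx are illustrative.

```r
n <- 6; k <- 4
set.seed(1)
# k distinct integers from 1:(n + k - 1), in increasing order
s <- sort(sample.int(n + k - 1, k))
# Subtracting 0, 1, ..., k - 1 turns a strictly increasing sequence in
# 1:(n + k - 1) into a nondecreasing one in 1:n; the map is one-to-one
idx <- s - (seq_len(k) - 1)
# The sample itself, with repeats allowed
letters[idx]
```

Because every k-subset of 1:(n + k - 1) is equally likely and the map is a bijection, every unordered sample is equally likely as well.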


Hi Duncan,
You are right. The idea of the derivation consists in 'throwing' k placeholders ("*" in the example below) in the list of the individuals of the population. For example, if the population is letters[1:6], and the sample size is 4, the following code generates uniformly a 'sample'.
> n <- 6; k <- 4
> set.seed(2)
> xxx <- rep("*", n + k)
> ind <- sort(sample(2 : (n+k), k))
> xxx[setdiff(1 : (n+k), ind)] <- letters[seq.int(n)]
> noquote(xxx)
[1] a b * c d * * e f *
This represents the sample (b, d, d, f). I am still missing the "all" I need to do that you mention, that is how I can transform the vector xxx into something more readily usable, like c(b, d, d, f), or even a summary of counts. I guess I am looking for a bit of R trickery here...
Thank you,
Giovanni
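
One possible bit of R trickery for this, offered as a sketch: each "*" selects the letter immediately before it, and cumsum(xxx != "*") records at every position how many letters have appeared so far. In the code below xxx is typed in by hand from the output above, so the example does not depend on the random number generator; is_star, idx, smp, and counts are illustrative names.

```r
n <- 6
xxx <- c("a", "b", "*", "c", "d", "*", "*", "e", "f", "*")
is_star <- xxx == "*"
# Index of the letter that precedes each placeholder
idx <- cumsum(!is_star)[is_star]
# The sample as a plain character vector
smp <- letters[idx]
smp
# Summary of counts per individual, zeros included
counts <- tabulate(idx, nbins = n)
names(counts) <- letters[seq_len(n)]
counts
```

For the vector above this gives the sample b, d, d, f and the counts 0, 1, 0, 2, 0, 1 for a through f.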
________________________________________
From: Duncan Murdoch [ [hidden email]]
Sent: Wednesday, September 17, 2014 14:07
To: Giovanni Petris; [hidden email]
Subject: Re: [R] Generating unordered, with replacement, samples


I think this works, but you'd better check!
Sample the placeholders:
ind <- sort( sample(n + k - 1, n - 1) ) # I don't think sort() is necessary...
Add placeholders at the start and end:
ind <- c(0, ind, n + k)
Take the diffs, and subtract one:
diff(ind) - 1
I think this gives the counts you want.
Duncan Murdoch
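
The steps above can be wrapped into a small function for repeated use. This is only a sketch: the name rmultiset, the seed, and the n = 3, k = 2 check are illustrative and not from the thread. Since diff() needs the divider positions in increasing order, the sort() call is kept.

```r
# Hypothetical helper: counts for one unordered, with-replacement sample of
# size k from a population of size n, uniform over choose(n + k - 1, k) outcomes
rmultiset <- function(n, k) {
  # Positions of the n - 1 dividers among n + k - 1 slots, in increasing order
  bars <- sort(sample.int(n + k - 1, n - 1))
  # Gaps between consecutive dividers, with sentinels at 0 and n + k
  diff(c(0, bars, n + k)) - 1
}

set.seed(1)
counts <- rmultiset(6, 4)
# Expand the counts into the sample itself
rep(letters[1:6], times = counts)

# Rough uniformity check: each of the choose(3 + 2 - 1, 2) = 6 outcomes
# should have relative frequency near 1/6
draws <- replicate(6000, paste(rmultiset(3, 2), collapse = ""))
round(table(draws) / 6000, 2)
```

Each call costs one sample.int() and one sort() of n - 1 integers, so repeated use inside replicate() should be cheap for moderate n and k.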


Thank you!
That does exactly what I was looking for.
Best,
Giovanni
________________________________________
From: Duncan Murdoch [ [hidden email]]
Sent: Wednesday, September 17, 2014 15:02
To: Giovanni Petris; [hidden email]
Subject: Re: [R] Generating unordered, with replacement, samples

