

Dear R-help,
I wrote a function around mapply to select samples from a dataset. Is there a
faster way to do this without mapply? It is slow, and my real dataset is
larger. My goal is to rely more on matrix / vector operations and less on
lists (the format of the output can be flexible). Ideally, I would like to
stick to base R, without parallel processing or additional packages. Any
ideas will be appreciated!
# A list of sets of data to select from
bl <- list(
  list(c(1, 2), c(2, 3), c(3, 4), c(4, 5), c(5, 6), c(6, 7), c(7, 8), c(8, 9)),
  list(c(1, 2, 3), c(2, 3, 4), c(3, 4, 5), c(4, 5, 6), c(5, 6, 7), c(6, 7, 8)),
  list(c(1, 2, 3, 4, 5), c(2, 3, 4, 5, 6), c(3, 4, 5, 6, 7),
       c(4, 5, 6, 7, 8), c(5, 6, 7, 8, 9))
)
# Number of elements to be selected from each set
kn <- c(5, 4, 3)
# Total number of elements in each set
nb <- c(8, 6, 5)
# This returns a list, but preferably I would like a matrix
bl_func <- function() mapply(function(x, y, z) {
  x[sample.int(y, z, replace = TRUE)]
}, bl, nb, kn, SIMPLIFY = FALSE)
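To illustrate the matrix output asked for here: the vectors within a set all have the same length, so each set could be stored once as a matrix and sampled by row index. This is only a sketch under that assumption. The names bm and bm_func are hypothetical and do not appear in the thread, and the code still loops over the sets.

```r
# Same data as in the post above
bl <- list(
  list(c(1, 2), c(2, 3), c(3, 4), c(4, 5), c(5, 6), c(6, 7), c(7, 8), c(8, 9)),
  list(c(1, 2, 3), c(2, 3, 4), c(3, 4, 5), c(4, 5, 6), c(5, 6, 7), c(6, 7, 8)),
  list(c(1, 2, 3, 4, 5), c(2, 3, 4, 5, 6), c(3, 4, 5, 6, 7),
       c(4, 5, 6, 7, 8), c(5, 6, 7, 8, 9))
)
kn <- c(5, 4, 3)

# One-off conversion: each set becomes a matrix with one candidate per row.
# Assumes all vectors within a set have equal length.
bm <- lapply(bl, function(s) do.call(rbind, s))

# Hypothetical helper: draw kn[i] rows with replacement from the i-th matrix.
# Returns a list with one matrix per set.
bm_func <- function() lapply(seq_along(bm), function(i) {
  m <- bm[[i]]
  m[sample.int(nrow(m), kn[i], replace = TRUE), , drop = FALSE]
})

set.seed(111)
out <- bm_func()
```

Because sample.int is called with the same arguments and in the same order as in bl_func, the same seed should select the same elements; only the container changes from a list of vectors to a matrix.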
Best,
Chao
[[alternative HTML version deleted]]
______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Hi
I am not sure I understand your function, but a simple mapply with sample probably gives you the same result and may be quicker.
> set.seed(111)
> blf <- bl_func()
> set.seed(111)
> blm <- mapply(sample, bl, kn, replace = TRUE)
> all.equal(blf, blm)
[1] TRUE
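If the goal is to avoid mapply altogether, one possible base R sketch is to flatten all sets into a single list, draw every index in one vectorized step, and split the result afterwards. The helper name bl_vec is hypothetical and not from the thread. The indices come from runif rather than sample.int, so the draws will not match bl_func() under the same seed. The final rbind step assumes the vectors within a set have equal length.

```r
# Data from the original post
bl <- list(
  list(c(1, 2), c(2, 3), c(3, 4), c(4, 5), c(5, 6), c(6, 7), c(7, 8), c(8, 9)),
  list(c(1, 2, 3), c(2, 3, 4), c(3, 4, 5), c(4, 5, 6), c(5, 6, 7), c(6, 7, 8)),
  list(c(1, 2, 3, 4, 5), c(2, 3, 4, 5, 6), c(3, 4, 5, 6, 7),
       c(4, 5, 6, 7, 8), c(5, 6, 7, 8, 9))
)
kn <- c(5, 4, 3)
nb <- c(8, 6, 5)

# Hypothetical helper (not from the thread)
bl_vec <- function(bl, nb, kn) {
  flat   <- unlist(bl, recursive = FALSE)  # all candidates in one flat list
  set_id <- rep(seq_along(kn), kn)         # set that each draw belongs to
  # Uniform index in 1..nb[set] for every draw, i.e. sampling with replacement
  within <- floor(runif(sum(kn)) * nb[set_id]) + 1
  offset <- cumsum(c(0, nb))[set_id]       # position before each set in `flat`
  picked <- flat[offset + within]
  # One matrix per set: rows are draws, columns are vector positions
  lapply(split(picked, set_id), function(p) do.call(rbind, p))
}

set.seed(111)
res <- bl_vec(bl, nb, kn)
```

Whether this is actually faster on the larger dataset would need checking, for example by wrapping repeated calls of both versions in system.time().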
Cheers
Petr
> -----Original Message-----
> From: R-help <[hidden email]> On Behalf Of Chao Liu
> Sent: Tuesday, February 2, 2021 3:32 PM
> To: r-help <[hidden email]>
> Subject: [R] Alternative to mapply to select samples


Thank you for your input, Petr. I will give it a try.
Best,
Chao
On Wed, Feb 3, 2021 at 4:32 AM PIKAL Petr <[hidden email]> wrote:
> Hi
>
> I am not sure if I understand your function but simple mapply gives you
> probably the same result and may be quicker.

