

Dear RHelp,
I wrote a function based on mapply to draw samples from a dataset, but it is
slow and my real dataset is much larger. Is there a faster way to do this
without mapply? My goal is to rely more on matrix/vector operations and less
on lists (the format of the output can be flexible). Ideally, I would like to
stick to base R, without parallel processing or additional packages. Any
ideas will be appreciated!
# A list of sets from which elements are sampled
bl <- list(list(c(1, 2), c(2, 3), c(3, 4), c(4, 5), c(5, 6), c(6, 7),
                c(7, 8), c(8, 9)),
           list(c(1, 2, 3), c(2, 3, 4), c(3, 4, 5), c(4, 5, 6), c(5, 6, 7),
                c(6, 7, 8)),
           list(c(1, 2, 3, 4, 5), c(2, 3, 4, 5, 6), c(3, 4, 5, 6, 7),
                c(4, 5, 6, 7, 8), c(5, 6, 7, 8, 9)))
# Number of elements to be selected from each set
kn <- c(5, 4, 3)
# Total number of elements in each set
nb <- c(8, 6, 5)
# This returns a list, but I would prefer a matrix
bl_func <- function() mapply(function(x, y, z) {
  x[sample.int(y, z, replace = TRUE)]
}, bl, nb, kn, SIMPLIFY = FALSE)
Best,
Chao
Hi
I am not sure I understand your function, but a plain mapply call with sample probably gives you the same result and may be quicker.
> set.seed(111)
> blf <- bl_func()
> set.seed(111)
> blm <- mapply(sample, bl, kn, replace=TRUE)
> all.equal(blf, blm)
[1] TRUE
Cheers
Petr
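For the original request (fewer lists, no mapply), one possible base R sketch is to flatten the nested list once and draw every index in a single vectorized call. This is only an illustration, not something proposed in the thread: the names flat, offset, set_id and draw_all are made up here, and because the indices come from runif rather than sample.int, the draws will not match bl_func() for the same seed.

```r
# Data from the original post
bl <- list(
  list(c(1, 2), c(2, 3), c(3, 4), c(4, 5), c(5, 6), c(6, 7), c(7, 8), c(8, 9)),
  list(c(1, 2, 3), c(2, 3, 4), c(3, 4, 5), c(4, 5, 6), c(5, 6, 7), c(6, 7, 8)),
  list(c(1, 2, 3, 4, 5), c(2, 3, 4, 5, 6), c(3, 4, 5, 6, 7), c(4, 5, 6, 7, 8),
       c(5, 6, 7, 8, 9))
)
kn <- c(5, 4, 3)      # number of elements to draw from each set
nb <- lengths(bl)     # number of elements in each set: 8, 6, 5

# One-time setup (hypothetical helper objects)
flat   <- unlist(bl, recursive = FALSE)    # all 19 vectors in one flat list
offset <- cumsum(nb) - nb                  # start of each set in flat: 0, 8, 14
set_id <- rep(seq_along(bl), times = kn)   # set that each draw belongs to

# Draw all sum(kn) indices at once, with replacement, without mapply
draw_all <- function() {
  # runif() is in (0, 1), so idx is an integer between 1 and nb[set_id]
  idx <- floor(runif(length(set_id)) * nb[set_id]) + 1
  flat[offset[set_id] + idx]
}

picked <- draw_all()

# Draws from one set can be bound into a matrix, one draw per row
m1 <- do.call(rbind, picked[set_id == 1])
```

Here picked is still a list, since the vectors differ in length across sets, but the sampling step itself is a single vectorized operation. Whether this is faster than mapply on a larger dataset would need to be timed, for example with system.time.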
Thank you for your input, Petr. I will give it a try.
Best,
Chao
On Wed, Feb 3, 2021 at 4:32 AM PIKAL Petr < [hidden email]> wrote:
> Hi
>
> I am not sure if I understand your function but simple mapply gives you
> probably the same result and may be quicker.
>
> > set.seed(111)
> > blf <- bl_func()
> > set.seed(111)
> > blm <- mapply(sample, bl, kn, replace=TRUE)
> > all.equal(blf, blm)
> [1] TRUE
>
> Cheers
> Petr
>
