R 3.0.1 : parallel collection triggers "long memory not supported yet"


R 3.0.1 : parallel collection triggers "long memory not supported yet"

ivo welch-4
Dear R developers:

...
7: lapply(seq_len(cores), inner.do)
8: FUN(1:3[[3]], ...)
9: sendMaster(try(lapply(X = S, FUN = FUN, ...), silent = TRUE))

Selection: .....................Error in sendMaster(try(lapply(X = S, FUN =
FUN, ...), silent = TRUE)) :
  long vectors not supported yet: memory.c:3100


admittedly, my outcome will be a very big list, with 30,000 elements, each
containing data frames with 14 variables and around 200 to 5000
observations (say, 64KB on average).  thus, I estimate that the resulting
list is 20GB.  the specific code that triggers this is


    exposures.list <- mclapply(1:length(crsp.list.by.permno),
                               FUN = function(i, NMO = NMO) {
                                   calcbeta.for.one.stock(crsp.list.by.permno[[i]], NMO = NMO)
                               },
                               NMO = NMO, mc.cores = 3)
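
a rough way to sanity-check a size estimate like this is to serialize one representative element (the data frame below is a hypothetical stand-in with the stated shape of ~14 variables and a few thousand observations) and scale up:

```r
## Rough size check (a sketch; the data frame shape is hypothetical,
## chosen to match "14 variables and around 200 to 5000 observations").
df <- data.frame(matrix(rnorm(2600 * 14), ncol = 14))
bytes.per.element <- length(serialize(df, NULL))  # serialized size in bytes
total.gb <- bytes.per.element * 30000 / 2^30      # scale to 30,000 elements
```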

the release notes for 3.0.0 suggest this error should occur primarily in
unusual situations, so it's not really a bug, but I thought I would point it
out.  maybe this is a forgotten updatelet.

regards,

/iaw
----
Ivo Welch ([hidden email])


______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Re: R 3.0.1 : parallel collection triggers "long memory not supported yet"

Simon Urbanek
On May 31, 2013, at 12:14 PM, ivo welch wrote:

mclapply() uses sendMaster() to send the results (serialized into a raw vector) from each worker back to the parent R session. Apparently the serialized result from one of your workers is more than 2GB. The multicore part of parallel currently doesn't support long vectors for this transmission, so the result from any single worker cannot exceed 2GB. I'll put long-vector support on my ToDo list. In your case you should be able to work around it by disabling pre-scheduling (though you may want to do some grouping if you have 30,000 short iterations).
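For the archive, here is a minimal sketch of the workaround Simon describes. The input list `big.list` and per-element function `f` are hypothetical toy stand-ins for `crsp.list.by.permno` and the per-stock beta calculation:

```r
library(parallel)

## Toy stand-ins (hypothetical): in the real case big.list would be
## crsp.list.by.permno and f the per-stock beta calculation.
big.list <- as.list(1:300)
f <- function(x) x^2

## 1) Disable pre-scheduling: each element's result goes back through its
##    own sendMaster() call, so no single transmission has to serialize a
##    third of the whole output at once.
res <- mclapply(big.list, f, mc.cores = 3, mc.preschedule = FALSE)

## 2) With 30,000 short iterations the fork-per-element overhead adds up,
##    so group elements into chunks; each chunk's result only has to stay
##    under the 2GB per-transmission limit.
chunks <- splitIndices(length(big.list), 30)  # 30 chunks of ~10 elements
res.chunked <- mclapply(chunks,
                        function(idx) lapply(big.list[idx], f),
                        mc.cores = 3, mc.preschedule = FALSE)
res2 <- do.call(c, res.chunked)  # flatten back into one flat result list
```

The chunk count is a tuning knob: fewer, larger chunks mean less fork overhead but bigger per-transmission payloads.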

Cheers,
Simon
