parallel number of cores according to memory?


ivo welch-2
if I understand correctly, R makes a copy of the full environment for each
process.  thus, even if I have 32 processors, if I only have 64GB of RAM
and my R process holds about 10GB, I should probably not spawn 32 processes.

has anyone written a function that sets the number of cores for use (in
mclapply) to be guessed at by appropriate memory requirements (e.g.,
"amount-of-RAM"/"RAM held by R")?
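for concreteness, the kind of back-of-the-envelope helper I mean (the function name, the hand-supplied RAM figures, and the 4GB OS reserve are all made up; querying free RAM portably is exactly the part I'd want a package to solve):

```r
# hypothetical sketch: cap the worker count by memory, not just by cores.
# total_ram_gb / per_worker_gb / reserve_gb are supplied by hand here,
# because measuring free RAM portably is the hard part.
guess_cores <- function(total_ram_gb, per_worker_gb,
                        max_cores = parallel::detectCores(),
                        reserve_gb = 4) {
  usable <- total_ram_gb - reserve_gb              # leave headroom for the OS
  by_mem <- max(1L, floor(usable / per_worker_gb)) # workers that fit in RAM
  as.integer(min(by_mem, max_cores))               # never exceed real cores
}
guess_cores(64, 10, max_cores = 32)   # 6, not 32, on the machine above
```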

(it would be even nicer if I could declare my 8GB data frame to be
read-only and to be shared among my processes, but this is presumably
technically very difficult.)

pointers appreciated.

/iaw


______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: parallel number of cores according to memory?

Jeff Newmiller
Use an operating system that supports forking, like Linux or macOS, and use the parallel package's mclapply function or similar to share memory for read operations. [1]

And stop posting in HTML here.

[1] https://cran.r-project.org/web/views/HighPerformanceComputing.html
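A minimal sketch of the fork-based shared-read pattern (toy sizes; the point is that the big object is created before the `mclapply` call so the forked children inherit it):

```r
library(parallel)

# toy stand-in for the big data frame; create it BEFORE forking so the
# children inherit the parent's memory pages instead of copying them
big <- data.frame(x = rnorm(1e6))

# on Linux/macOS, mclapply forks: reads of 'big' in the workers touch
# shared copy-on-write pages and cost (almost) no extra memory
res <- mclapply(1:4, function(i) sum(big$x) / i, mc.cores = 2)
```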


--
Sent from my phone. Please excuse my brevity.

Re: parallel number of cores according to memory?

ivo welch-2
ugghhh, apologies.  although in 2020, it would be nice if the mailing
list had an automatic html filter (or even a bouncer!)

I am using macos.  alas, my experiments suggest that when I run
`mclapply()` on a 32-core intel system with 64GB of RAM, where the
input data frame is 8GB and the output is about 500MB per core (to be
stitched together into about 16GB), the system starts swapping like
crazy, grinds to a halt, and then usually crashes.

/iaw

Re: parallel number of cores according to memory?

Jeff Newmiller
The list does strip html, but the quality of what remains varies greatly.

Are you using tidyverse functions in your workers? It sounds to me like you are doing something in the workers that triggers copies of the input data frame.
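A sketch of the difference (toy sizes; under copy-on-write, reads stay shared but any write in a worker duplicates the touched pages):

```r
library(parallel)
big <- data.frame(x = rnorm(1e6), g = rep(1:4, length.out = 1e6))

# fine: the workers only read 'big', so the forked pages stay shared
good <- mclapply(1:4, function(i) mean(big$x[big$g == i]), mc.cores = 2)

# risky: assigning into 'big' inside a worker forces that worker to
# duplicate the modified column; multiply that by 32 workers and a
# 64GB machine starts swapping
bad <- mclapply(1:4, function(i) { big$x <- big$x * i; sum(big$x) },
                mc.cores = 2)
```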


Re: parallel number of cores according to memory?

ivo welch-5
no, I'm not.   mostly conventional use afaik.  if this should not be
happening, I can trace it down to a small reproducible example to
figure it out.
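something shaped like the real job, scaled way down, is what I have in mind (all names and sizes here are made up):

```r
library(parallel)

# same shape as the real run, scaled down: one big shared input,
# a modest per-worker result, stitched together at the end
input <- data.frame(x = rnorm(1e5), g = rep(1:8, length.out = 1e5))
pieces <- mclapply(1:8, function(i) {
  d <- input[input$g == i, ]          # read-only slice of the shared input
  data.frame(g = i, m = mean(d$x))
}, mc.cores = 2)
result <- do.call(rbind, pieces)      # the "stitched together" step
```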


--
Ivo Welch ([hidden email])
http://www.ivo-welch.info/
J. Fred Weston Distinguished Professor of Finance, UCLA Anderson



