Interpreting R memory profiling statistics from Rprof() and gc()


Joy-2
Sorry, this might be a really basic question, but I'm trying to interpret
the results from memory profiling, and I have a few questions (marked by
*Q#*).

From the summaryRprof() documentation, it seems that the four columns of
statistics that are reported when setting memory.profiling=TRUE are
- vector memory in small blocks on the R heap
- vector memory in large blocks (from malloc)
- memory in nodes on the R heap
- number of calls to the internal function duplicate in the time interval
(*Q1:* Are the units of the first 3 stats in bytes?)

and from the gc() documentation, the two rows represent
- ‘"Ncells"’ (_cons cells_), usually 28 bytes each on 32-bit systems and 56
bytes on 64-bit systems,
- ‘"Vcells"’ (_vector cells_, 8 bytes each)
(*Q2:* how are Ncells and Vcells related to small heap/large heap/memory in
nodes?)

And I guess the question that led to these other questions is - *Q3:* I'd
like to plot the total amount of memory used over time, and I don't think
Rprofmem() gives me what I'd like to know because, as I understand it,
Rprofmem() records the amount of memory allocated with each call, but this
doesn't tell me the total amount of memory R is using. Or am I mistaken?

Thanks in advance!

Joy


______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: Interpreting R memory profiling statistics from Rprof() and gc()

Tomas Kalibera
On 05/18/2017 06:54 PM, Joy wrote:

> Sorry, this might be a really basic question, but I'm trying to interpret
> the results from memory profiling, and I have a few questions (marked by
> *Q#*).
>
>  From the summaryRprof() documentation, it seems that the four columns of
> statistics that are reported when setting memory.profiling=TRUE are
> - vector memory in small blocks on the R heap
> - vector memory in large blocks (from malloc)
> - memory in nodes on the R heap
> - number of calls to the internal function duplicate in the time interval
> (*Q1:* Are the units of the first 3 stats in bytes?)
In Rprof.out, vector memory in small and large blocks is given in 8-byte
units (for historical reasons), but memory in nodes is given in bytes -
this is not guaranteed by the documentation. In
summaryRprof(memory="both"), memory usage is given in megabytes, as
documented.
For summaryRprof(memory="stats") and summaryRprof(memory="tseries") I
clarified this in r72743: memory usage is now in bytes, and that is
documented.
>
> and from the gc() documentation, the two rows represent
> - ‘"Ncells"’ (_cons cells_), usually 28 bytes each on 32-bit systems and 56
> bytes on 64-bit systems,
> - ‘"Vcells"’ (_vector cells_, 8 bytes each)
> (*Q2:* how are Ncells and Vcells related to small heap/large heap/memory in
> nodes?)
Ncells describe memory in nodes (Ncells is the number of nodes).

Vcells describe memory in the "small heap" + "large heap". A Vcell today
does not have much meaning and is shown for historical reasons, but the
useful fact is that Vcells*8 (a Vcell is 8 bytes) gives the number of
bytes in "small heap" + "large heap" objects.
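As a rough sketch of that relationship (cell sizes as documented for a
64-bit system; the arithmetic, not the exact numbers, is the point):

```r
# Sketch: relating gc() output to the heap statistics above.
# Assumes a 64-bit system (56 bytes per node, 8 bytes per Vcell).
g <- gc()                      # matrix with rows "Ncells" and "Vcells"
ncells <- g["Ncells", "used"]  # number of nodes on the R heap
vcells <- g["Vcells", "used"]  # number of 8-byte vector cells

node_bytes   <- ncells * 56    # memory in nodes
vector_bytes <- vcells * 8     # "small heap" + "large heap" vector memory
cat("nodes:", node_bytes, "bytes; vectors:", vector_bytes, "bytes\n")
```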

> And I guess the question that lead to these other questions is - *Q3:* I'd
> like to plot out the total amount of memory used over time, and I don't
> think Rprofmem() give me what I'd like to know because, as I'm
> understanding it, Rprofmem() records the amount of memory allocated with
> each call, but this doesn't tell me the total amount of memory R is using,
> or am I mistaken?
Rprof controls a sampling profiler which regularly asks the GC how much
memory is currently in use on the R heap (but beware: some of that memory
may no longer be reachable but has not yet been collected - running the
GC more frequently helps - and some may still be reachable but will never
be used again). You can get this data from summaryRprof(memory="tseries")
and plot it: add columns 1+2 or 1+2+3, depending on what you want, in
r72743 or more recent; in older versions you need to multiply columns 1
and 2 by 8. To run the GC more frequently you can use gctorture.
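A minimal sketch of that workflow (the workload is a placeholder; summing
columns 1:3 assumes r72743 or later, where the time series is in bytes):

```r
# Sketch: plot total R heap usage over time from the memory profiler.
# Assumes R >= r72743 (tseries columns in bytes); workload is a placeholder.
Rprof("Rprof.out", memory.profiling = TRUE, interval = 0.01)
x <- lapply(1:200, function(i) rnorm(1e4))   # placeholder workload
Rprof(NULL)

ts <- summaryRprof("Rprof.out", memory = "tseries")
total <- rowSums(ts[, 1:3])   # small vectors + large vectors + nodes
plot(as.numeric(rownames(ts)), total / 2^20, type = "l",
     xlab = "time (s)", ylab = "R heap in use (MB)")
```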

Or, if you are happy modifying your own R code and don't insist on
querying the memory size very frequently, you can also call
gc(verbose=TRUE) explicitly at intervals; for this you won't need the
profiler.
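For example (a sketch; the workload is a placeholder), one could record
the used-megabytes column of gc() between workload steps:

```r
# Sketch: sample total heap usage by calling gc() between workload steps.
# gc()[, 2] is the "(Mb)" column next to "used"; summing its two rows
# gives Ncells + Vcells usage in megabytes.
usage <- numeric(0)
for (i in 1:5) {
  x <- rnorm(1e5 * i)               # placeholder workload
  usage <- c(usage, sum(gc()[, 2])) # total heap in use, in Mb
}
print(usage)
```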

If you were instead looking at how much memory the whole R instance is
using (that is, including memory allocated by the R GC but not presently
used for R objects, and memory outside the R heap), the easiest way would
be to use the facilities of your OS.

Rprofmem is a different thing and won't help you.

Best
Tomas

>
> Thanks in advance!
>
> Joy
>
