Why does a 2 GB RData file exceed my 16GB memory limit when reading it in?


David Jones
I ran a number of analyses in R and saved the workspace, which
resulted in a 2 GB .RData file. When I try to read the file back into R
later, the load fails with the error: "Error: cannot
allocate vector of size 37 Kb"

This error comes after about a minute of trying to read things in - I
presume a single vector sends it over the memory limit. But
memory.limit() shows that I have access to a full 16 GB of RAM on my
machine (12 GB are free when I try to load the .RData file).

gc() shows the following after I receive this error:

            used (Mb) gc trigger   (Mb)   max used    (Mb)
Ncells    623130 33.3    4134347  220.8    5715387   305.3
Vcells   1535682 11.8  883084810 6737.5 2100594002 16026.3

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: Why does a 2 GB RData file exceed my 16GB memory limit when reading it in?

Uwe Ligges-3


On 02.09.2020 04:44, David Jones wrote:
> I ran a number of analyses in R and saved the workspace, which
> resulted in a 2GB .RData file. When I try to read the file back into R

Compressed in RData but uncompressed in main memory....


> later, it won't read into R and provides the error: "Error: cannot
> allocate vector of size 37 Kb"
>
> This error comes after 1 minute of trying to read things in - I
> presume a single vector sends it over the memory limit. But,
> memory.limit() shows that I have access to a full 16gb of ram on my
> machine (12 GB are free when I try to load the RData file).

But the data may need more....


> gc() shows the following after I receive this error:
>
>             used (Mb) gc trigger   (Mb)   max used    (Mb)
> Ncells    623130 33.3    4134347  220.8    5715387   305.3
> Vcells   1535682 11.8  883084810 6737.5 2100594002 16026.3

So 16 GB were used when R gave up.
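The gap between file size and memory footprint is easy to demonstrate in base R (the file here is a temp file, and the numbers are approximate): a highly compressible object can cost far more RAM than its .RData file suggests.

```r
# A highly repetitive vector: ~40 MB in memory, but almost free to compress.
x <- rep(1.0, 5e6)

f <- tempfile(fileext = ".RData")
save(x, file = f)          # save() gzip-compresses by default

file.size(f)               # only a few KB on disk
print(object.size(x))      # ~40 MB once loaded into memory

unlink(f)
```

Real workspaces compress less dramatically than a constant vector, but a 2 GB .RData file easily corresponds to well over 16 GB of live objects.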

Best,
Uwe Ligges





Re: Why does a 2 GB RData file exceed my 16GB memory limit when reading it in?

R help mailing list-2
On Wed, 2 Sep 2020 13:36:43 +0200
Uwe Ligges <[hidden email]> wrote:

> [...]

For my own part, looking at the OP's question, it does seem curious
that R could write that .RData file but, on the same system, not be
able to reload something it created.  How would that work?  Wouldn't
the memory limit have been exceeded BEFORE the .RData file was written
the FIRST time?

JDougherty


Re: Why does a 2 GB RData file exceed my 16GB memory limit when reading it in?

Bert Gunter-2
R experts may give you a detailed explanation, but it is certainly possible
that the memory available to R when it wrote the file was different from
when it tried to read it, is it not?

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Wed, Sep 2, 2020 at 1:27 PM John via R-help <[hidden email]> wrote:

> [...]


Re: Why does a 2 GB RData file exceed my 16GB memory limit when reading it in?

David Jones
In reply to this post by Uwe Ligges-3
Thank you Uwe, John, and Bert - this is very helpful context.

If it helps inform the discussion, to address John and Bert's
questions - I actually had less memory free when I originally ran the
analyses and saved the workspace than when I later read the data back
in (I rebooted in an attempt to free all possible memory before
reading the workspace back in).



On Wed, Sep 2, 2020 at 1:27 PM John via R-help <r-help using
r-project.org> wrote:

>> [...]


Re: Why does a 2 GB RData file exceed my 16GB memory limit when reading it in?

Leandro Marino-2
David,

If the .RData file contains more than one object, you could (and maybe
should) use the SOAR package (from Bill Venables). It helps you split the
objects across multiple .RData files, which is useful when you have
numerous medium-to-large objects in the workspace but don't use them all
at the same time.

When you call SOAR::Attach(), for instance, it puts the names of all the
stored objects on the search path without loading them into memory; each
object is loaded only when you first use it.

If needed, you can update an object and then store it again with
SOAR::Store().

For my use, this package is terrific! I use it for an analysis that I need
to repeat over medium-to-large, similar datasets.
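A minimal sketch of that workflow, assuming the SOAR package is installed (the object names here are illustrative):

```r
library(SOAR)            # install.packages("SOAR") if needed

big1 <- runif(1e6)
big2 <- runif(1e6)

Store(big1, big2)        # writes each object to its own file under ./.R_Cache,
                         # removes it from the workspace, and puts the cache
                         # on the search path
ls()                     # "big1" and "big2" are no longer in the workspace
sum(big1)                # first use lazy-loads big1 back into memory

# In a later session, Attach() re-exposes the stored objects the same way.
```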

Best
Leandro

On Wed, Sep 2, 2020 at 6:33 PM David Jones <[hidden email]>
wrote:

> [...]


Re: Why does a 2 GB RData file exceed my 16GB memory limit when reading it in?

Jeff Newmiller
In reply to this post by David Jones
You need more RAM to load this file. While your original session was running, certain objects (such as numeric columns) were being shared among different higher-level objects (such as data frames). When the workspace was serialized into the file, those sharing optimizations were lost, so after reloading those columns are stored separately and take more space.

Search [1] for "shared" to learn more about measuring object memory requirements.

[1] http://adv-r.had.co.nz/memory.html
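This is easy to check in base R: two list elements bound to the same vector share storage in RAM, but serialize() (which save() uses underneath) writes the vector out twice, because sharing of ordinary vectors is not preserved in the serialized stream.

```r
x <- runif(1e6)               # one ~8 MB vector of doubles
shared <- list(a = x, b = x)  # 'a' and 'b' point at the same storage in RAM

# Serialization writes each element out in full, so the stream is just
# over 16 million bytes: double the in-memory cost of the shared vector.
bytes <- length(serialize(shared, NULL))
bytes
```

On reload, the two copies stay separate, which is why a workspace can need more RAM after a save/load round trip than it did originally.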

On September 2, 2020 2:31:53 PM PDT, David Jones <[hidden email]> wrote:

>[...]

--
Sent from my phone. Please excuse my brevity.


Re: Why does a 2 GB RData file exceed my 16GB memory limit when reading it in?

R help mailing list-2
In reply to this post by David Jones
On Wed, 2 Sep 2020 16:31:53 -0500
David Jones <[hidden email]> wrote:

> [...]
I assumed that, though I shouldn't have.  Nice to know.  Were you
working from a terminal or through a GUI like RStudio?  You will need
to provide a really clear description of the initial and later
conditions.  Your step of rebooting and then loading is exactly what I
would have done; I would also have killed any network connection
temporarily to see whether something outside of R was causing the
problem.  You should also let any potential helper know what OS you
are using and what hardware configuration you have.  Since you
rebooted you are probably not working across a network, but ...

JWDougherty


Re: Why does a 2 GB RData file exceed my 16GB memory limit when reading it in?

Ista Zahn
In reply to this post by Leandro Marino-2
On Wed, Sep 2, 2020 at 7:22 PM Leandro Marino
<[hidden email]> wrote:

> [...]

The qs package might also be worth a try. I don't have a specific
reason for thinking it will avoid the original problem, but in general
qs uses lots of fancy compression and memory management features.
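For what it's worth, a minimal qs round trip looks like this (assuming the package is installed; qsave()/qread() per its documentation):

```r
library(qs)              # install.packages("qs") if needed

obj <- list(x = runif(1e5), y = letters)
f <- tempfile(fileext = ".qs")

qsave(obj, f)            # fast, strongly compressed serialization
obj2 <- qread(f)
identical(obj, obj2)     # the round trip is lossless

unlink(f)
```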

--Ista
