Save creates huge files, dump doesn't

Lars Velten
Dear list,
I noticed an extremely odd behavior. I have a rather complex shiny app which allows the user to store his/her state; internally this triggers a call to save() as follows:
save(list=c("plots","gates","populations","cg", "genelists","colorscores",  "proj", "actds"),file=fname)
This was all working fine until some time ago (?!?), when the files created by this command became several hundred MB in size, even though the cumulative size of all objects in memory after load() is in the tens of kB.
Changing to
dump(list=c("plots","gates","populations","cg", "genelists","colorscores",  "proj", "actds"),file=fname)
solved the problem; the output was then only tens of kB.
Is this behavior intended, and if so, why and when does it occur?
Best wishes,
Lars

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: Save creates huge files, dump doesn't

Doran, Harold
Lars

Typically, questions regarding shiny are not answered here (I do wish there was a SIG, however). With that said, are you saying that the object size is increasing for objects that previously were much smaller? Or are other things now being saved into the workspace that were not there previously?

In other words, suppose you have some object called tmp. If you run object.size(tmp) on your older version and on the newer version, are they the same size? Are the objects of the same class?
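
A minimal sketch of one way to run such a comparison. It relies on the fact that object.size() does not follow the environment attached to a function, whereas the length of serialize() output approximates what save() has to write before compression. The objects small and f and the helper make_f are made up for illustration and are not taken from the app in question.

```r
# Hypothetical stand-ins for the objects stored by the app.
make_f <- function() {
  big <- numeric(1e6)   # about 8 MB that the returned function never uses
  function(x) x
}
objs <- list(small = 1:10, f = make_f())

# For each object, compare the in-memory estimate with the serialized size.
sizes <- sapply(objs, function(obj) {
  c(object.size = as.numeric(object.size(obj)),
    serialized  = length(serialize(obj, NULL)))
})
print(sizes)
```

An object whose serialized size is far larger than its object.size() value is a likely candidate for carrying a large environment along with it.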



-----Original Message-----
From: R-help <[hidden email]> On Behalf Of Lars Velten
Sent: Monday, February 18, 2019 2:51 PM
To: [hidden email]
Subject: [R] Save creates huge files, dump doesn't


Re: Save creates huge files, dump doesn't

Jeff Newmiller
Make a reproducible example that focuses on the save/load aspect of the size problem. You may need to experiment with which variables need to be in the save file in order to trigger the behavior. Your example might have to involve sending us a link to a large file, but that size may dissuade busy experts from tackling it, so paring it down by experimentation could be in your best interest.

There is some expected behavior that can lead to larger files than the original in-memory data, but offhand I am unaware of any explanation for those files then using less space when re-loaded into memory than they occupy on disk.


Re: Save creates huge files, dump doesn't

R help mailing list-2
One reason save() makes bigger files than dump() is that save() saves
environments associated with functions that are saved and those
environments may contain large datasets that are not really needed.
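
The effect described above can be reproduced with a small sketch. The function make_identity and the variable payload are hypothetical and only stand in for whatever a package might leave in a closure's environment.

```r
# A closure whose enclosing environment holds a large, unused object.
make_identity <- function() {
  payload <- rnorm(1e6)   # about 8 MB of doubles, never referenced by the closure
  function(x) x
}
f <- make_identity()

rda_file <- tempfile(fileext = ".rda")
txt_file <- tempfile(fileext = ".R")

save(list = "f", file = rda_file)   # serializes f together with its environment
dump(list = "f", file = txt_file)   # writes only the deparsed source of f

# The .rda file should be in the MB range, the dump file only a few bytes.
print(c(save = file.size(rda_file), dump = file.size(txt_file)))
```

Note that source()-ing the dump file recreates f in the environment where source() is evaluated, so the result is only equivalent when the function does not depend on anything in its original environment.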

Bill Dunlap
TIBCO Software
wdunlap tibco.com


On Tue, Feb 19, 2019 at 11:59 AM Jeff Newmiller <[hidden email]>
wrote:


Re: Save creates huge files, dump doesn't

Lars Velten
Dear Bill, dear all,

Yes, that seems to be it. The problem originates from objects of class transformMap from package flowCore:

> object_size(object@transforms@transforms$PC1.all@f)
174 MB
> object.size(object@transforms@transforms$PC1.all@f)
1160 bytes
> object@transforms@transforms$PC1.all@f
function(x) x
<environment: 0x3314db8>

Do you know how to 'see' what's in 0x3314db8? I might then drop a line to flowCore's developer; this behavior cannot be intended, especially here where f literally is just the identity :-)

Best wishes,

Lars

> On 19. Feb 2019, at 21:30, William Dunlap <[hidden email]> wrote:

Re: Save creates huge files, dump doesn't

R help mailing list-2
> object@transforms@transforms$PC1.all@f
> function(x) x
> <environment: 0x3314db8>
> Do you know how to 'see' what's in 0x3314db8 ?

ls.str(environment(object@transforms@transforms$PC1.all@f), all.names=TRUE)

will list the names, types, summaries, etc. of the objects in that
environment.
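
As a self-contained sketch of the same inspection on a toy closure: make_identity, payload and .hidden are invented names for illustration, not what flowCore actually stores.

```r
make_identity <- function() {
  payload <- numeric(1e6)   # large object left behind in the environment
  .hidden <- letters        # dot-prefixed names only show up with all.names = TRUE
  function(x) x
}
f <- make_identity()

env <- environment(f)

# Names, types and short summaries of everything in the environment.
print(ls.str(envir = env, all.names = TRUE))

# Size in bytes of each object in the environment, to find the large ones.
sizes <- sapply(ls(envir = env, all.names = TRUE),
                function(nm) as.numeric(object.size(get(nm, envir = env))))
print(sizes)
```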

Bill Dunlap
TIBCO Software
wdunlap tibco.com


On Wed, Feb 20, 2019 at 12:20 AM Lars Velten <[hidden email]> wrote:

Re: Save creates huge files, dump doesn't

R help mailing list-2
Also, note that the function
   function(x) x
   <environment: 0x3314db8>
has no free variables so it doesn't matter what environment encloses it.
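
A rough sketch of how one might act on this, under stated assumptions: the free-variable check below uses only base R and looks at variable names, ignoring called functions (those would still be found via the search path). Resetting the environment to globalenv() is a possible workaround on a toy closure, not something confirmed for flowCore objects. The names make_identity and payload are hypothetical.

```r
make_identity <- function() {
  payload <- numeric(1e6)   # large object the closure never uses
  function(x) x
}
f <- make_identity()

# Variable names used in the body that are not formal arguments.
free_vars <- setdiff(all.vars(body(f)), names(formals(f)))
print(free_vars)

before <- length(serialize(f, NULL))
if (length(free_vars) == 0) {
  # Nothing is looked up in the enclosing environment, so it can be swapped
  # for the global environment, which is serialized as a reference only.
  environment(f) <- globalenv()
}
after <- length(serialize(f, NULL))
print(c(before = before, after = after))
```

After the swap, save() no longer has to write payload, and f(3) still returns 3.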

Bill Dunlap
TIBCO Software
wdunlap tibco.com


On Wed, Feb 20, 2019 at 7:47 AM William Dunlap <[hidden email]> wrote:
