I can't figure out how to save functions to an RDS file. Here is an example of what I am trying to achieve:

> t <- rnorm(100)
> cdf <- ecdf(t)
> cdf(0)
[1] 0.59
> saveRDS(cdf, "/tmp/foo")
> Save workspace image? [y/n/c]: n
[gtrojan@asok petproject]$ R
> cdf <- readRDS("/tmp/foo")
> cdf
Empirical CDF
Call: ecdf(t)
 x[1:100] = -2.8881, -2.2054, -2.0026, ..., 2.0367, 2.0414

This works. However, when instead of saving cdf() I try to save the function

> trans <- function(x) qnorm(cdf(x) * 0.99)

I get an error after restoring the object from the file:

> trans <- readRDS("/tmp/foo")
> trans(0)
Error in qnorm(cdf(x) * 0.99) : could not find function "cdf"

I tried to define and call cdf within the definition of trans, without success:

> tmp <- rnorm(100)
> trans <- function(x) { cdf <- ecdf(tmp); cdf(0); qnorm(cdf(x)) * 0.99 }
> saveRDS(trans, "/tmp/foo")
Save workspace image? [y/n/c]: n
> trans <- readRDS("/tmp/foo")
> trans
function(x) { cdf <- ecdf(tmp); cdf(0); qnorm(cdf(x)) * 0.99 }
> trans(0)
Error in sort(x) : object 'tmp' not found

So here the call cdf(0) did not force evaluation of my random sample. What am I missing?

George
It worked fine for me:

> t <- rnorm(100)
> cdf <- ecdf(t)
>
> trans <- function(x) qnorm(cdf(x) * 0.99)
> saveRDS(trans, "/tmp/foo")
> trans(1.2)
[1] 1.042457
> trans1 <- readRDS("/tmp/foo")
> trans1(0)
[1] 0.1117773

Of course, if I remove cdf() from the global environment, it will fail:

> rm(cdf)
> trans1(0)
Error in qnorm(cdf(x) * 0.99) : could not find function "cdf"

So it looks like you're clearing your global workspace in between saving and loading?

You may need to read up on function closures/lexical scoping: a user-defined function in R includes not only code but also a pointer to the environment in which it was defined, in your case the global environment, from which you apparently removed cdf(). Note that functions are not evaluated until called, so free variables that do not or will not exist in the function's lexical scope will not trigger any errors until the function *is* called.

Same comments for your second version -- if tmp is removed, the function will fail.

Cheers,
Bert

Bert Gunter
"The trouble with having an open mind is that people keep coming along and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)
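The point about the environment pointer can be checked directly. The following is a minimal sketch, not taken from the thread: it uses evalq() with globalenv() only to mimic typing at the top-level prompt, the temporary file path is a stand-in for "/tmp/foo", and rm() stands in for starting a fresh session.

```r
# Define cdf and trans in the global environment, as if typed at the prompt.
evalq({
  set.seed(1)                                  # arbitrary seed, for reproducibility only
  cdf <- ecdf(rnorm(100))
  trans <- function(x) qnorm(cdf(x) * 0.99)    # cdf is a free variable here
}, globalenv())

# The closure's enclosing environment is the global environment.
env_is_global <- identical(environment(trans), globalenv())

path <- tempfile(fileext = ".rds")             # hypothetical stand-in for "/tmp/foo"
saveRDS(trans, path)
trans1 <- readRDS(path)

# Simulate a new session in which cdf was never created.
rm("cdf", envir = globalenv())

# The body of trans1 is intact, but the lookup of cdf now fails at call time.
msg <- tryCatch(trans1(0), error = function(e) conditionMessage(e))
print(env_is_global)
print(msg)
```

The saved file holds the code of trans and a reference to the global environment, but not the objects living in that environment, which is why the restored function only works while a suitable cdf happens to exist.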
So doesn't the fact that a function contains a reference to an environment suggest that this whole exercise is a really bad idea?
Jeff:
Oh yes! -- and I meant to say so and forgot, so I'm glad you did. Not only might the free variable in the function not be there; worse yet, it might be there but be something else. So it seems like a disaster waiting to happen. The solution, I would presume, is to have no free variables (make them arguments). Or save and read the function *and* its environment. Namespaces in packages would, I think, also take care of this, right?

Note: If my understanding of any of this is incorrect, I would greatly appreciate someone setting me straight. In particular, as Jeff noted, my understanding is that a saved function (closure) with a free variable depends on finding its enclosing environment when it is read back into R via readRDS(). Correct? The man page is silent on this point.

Cheers,
Bert

On February 12, 2017 4:05:31 PM PST, Bert Gunter wrote:
> It worked fine for me:
> [...]
______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
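One way to "save the function *and* its environment" is to create the function inside another function, so that its enclosing environment is a local one rather than the global environment. A minimal sketch follows; make_trans is a hypothetical helper name, not something from the thread, and the temporary path is again only a stand-in.

```r
# Hypothetical factory: builds cdf once and returns a closure that captures it.
make_trans <- function(sample) {
  cdf <- ecdf(sample)                  # evaluated now, stored in the local environment
  function(x) qnorm(cdf(x) * 0.99)     # cdf is found in that local environment
}

set.seed(1)                            # arbitrary seed, for reproducibility only
trans <- make_trans(rnorm(100))

path <- tempfile(fileext = ".rds")     # hypothetical file location
saveRDS(trans, path)
trans2 <- readRDS(path)

# The restored closure brings its own environment, including cdf.
print(ls(environment(trans2)))
print(identical(trans2(0), trans(0)))
```

Unlike the global environment, a local enclosing environment is written to the file together with the function, so trans2 does not depend on what exists in the session that reads it. Note that the environment also holds the raw sample, which makes the file larger. The other suggestion, passing cdf as an argument, avoids free variables altogether.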
I want to split my computation into parts. The first script processes the data, the second does the graphics. I want to save the results of time-consuming calculations. My example tried to simulate this by terminating the session without saving it, so the environment was lost on purpose. What confuses me is that ecdf can be saved and restored, but not my own derived function. Of course I can save parameters and redefine the function in the second script. I am reading Chapter 8 of Advanced R; hopefully the book will clear my mind.
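The "save parameters and redefine the function" route might look like the sketch below. It assumes the expensive result is the ecdf object itself; the file name cdf.rds and the variable cdf_restored are made up for illustration, and both "scripts" are shown in one block.

```r
# --- first script: the time-consuming part ---
set.seed(1)                                # arbitrary seed, for reproducibility only
cdf <- ecdf(rnorm(100))
path <- file.path(tempdir(), "cdf.rds")    # hypothetical file location
saveRDS(cdf, path)

# --- second script: run in a new session ---
cdf_restored <- readRDS(path)

# Redefine trans with cdf as an explicit argument, so it has no free variables.
trans <- function(x, cdf) qnorm(cdf(x) * 0.99)
print(trans(0, cdf_restored))
```

Only data-like results cross the session boundary here, and the second script states explicitly which object the transformation uses.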
ecdf() is part of the stats package, which is (typically) automatically attached on startup.

I have no idea what you mean by "splitting" and "saving." This is basically how all of R works -- e.g. see the value of lm() and the (S3) plot method, plot.lm, for "lm" objects. This has nothing to do with free variables and lexical scoping. Perhaps you need to review how functions and S3 methods work?

Cheers,
Bert
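As a possible explanation of why the object returned by ecdf() can be restored on its own: it is itself a closure, created inside ecdf(), so its enclosing environment is a local one that holds the data. The sketch below only inspects that environment; the temporary path is a stand-in for "/tmp/foo".

```r
set.seed(1)                             # arbitrary seed, for reproducibility only
cdf <- ecdf(rnorm(100))

env <- environment(cdf)
print(is.function(cdf))                 # an ecdf object is a function
print(identical(env, globalenv()))      # FALSE: it was created inside ecdf()
print(exists("x", envir = env, inherits = FALSE))  # the sorted sample values live here

path <- tempfile(fileext = ".rds")      # hypothetical file location
saveRDS(cdf, path)
cdf2 <- readRDS(path)
print(identical(cdf2(0), cdf(0)))       # the data travelled with the function
```

Calling cdf2 still needs the stats package to be loaded, which is normally the case, but the sample values come from the file and not from the session.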
