Objects not gc'ed due to caching (?) in R's S3 dispatch mechanism

Objects not gc'ed due to caching (?) in R's S3 dispatch mechanism

Iñaki Úcar
Hi,

I initially opened an issue in the R6 repo because my issue was with
an R6 object. But Winston (thanks!) further simplified my example, and
it turns out that the issue (whether a feature or a bug is yet to be
seen) had to do with S3 dispatching.

The following example, by Winston, depicts the issue:

print.foo <- function(x, ...) {
  cat("print.foo called\n")
  invisible(x)
}

new_foo <- function() {
  e <- new.env()
  reg.finalizer(e, function(e) message("Finalizer called"))
  class(e) <- "foo"
  e
}

new_foo()
gc() # still in .Last.value
gc() # nothing

I would expect the second call to gc() to free 'e', but it does not.
However, if we now call *any* S3 method, the object can finally be
gc'ed:

print(1)
gc() # Finalizer called

So the hypothesis is that there is some kind of caching (?) mechanism
going on. Intended behaviour or not, this is something that was
introduced between R 3.2.3 and 3.3.2 (the first succeeds; from the
second on, the example fails as described above).

Regards,
Iñaki

PS: Further discussion and examples in https://github.com/r-lib/R6/issues/140

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: Objects not gc'ed due to caching (?) in R's S3 dispatch mechanism

Winston Chang
I'd like to emphasize that although Iñaki's example uses print(), it
also happens with other S3 generics. Please note that each of the
following examples might need to be run in a clean R session to work.

===========
Here's an example that doesn't use S3 dispatch. The finalizer runs correctly.

ident <- function(x) invisible(x)

env_with_finalizer <- function() {
  reg.finalizer(environment(), function(e) message("Finalizer called"))
  environment()
}

ident(env_with_finalizer())
gc() # Still in .Last.value
gc() # Finalizer called


===========
Here's an example that uses S3. In this case, the finalizer doesn't run.

ident <- function(x) UseMethod("ident")
ident.default <- function(x) invisible(x)

env_with_finalizer <- function() {
  reg.finalizer(environment(), function(e) message("Finalizer called"))
  environment()
}

ident(env_with_finalizer())
gc()
gc() # Nothing

However, if the S3 generic is called with another object, the
finalizer will run on the next GC:

ident(1)
gc() # Finalizer called

===========

This example is the same as the previous one, except that, at the end,
instead of calling the same S3 generic on a different object (that is,
ident(1)), it calls a _different_ S3 generic on a different object
(mean(1)).

ident <- function(x) UseMethod("ident")
ident.default <- function(x) invisible(x)

env_with_finalizer <- function() {
  reg.finalizer(environment(), function(e) message("Finalizer called"))
  environment()
}

ident(env_with_finalizer())
gc()
gc() # Nothing

# Call a different S3 generic
mean(1)
gc() # Finalizer called


-Winston


Re: Objects not gc'ed due to caching (?) in R's S3 dispatch mechanism

luke-tierney
This has nothing to do with printing or dispatch per se. It is the
result of an internal register (R_ReturnedValue) being protected. It
gets rewritten whenever there is a jump, e.g. by an explicit return
call. So a simplified example is

new_foo <- function() {
  e <- new.env()
  reg.finalizer(e, function(e) message("Finalizer called"))
  e
}

bar <- function(x) return(x)

bar(new_foo())
gc() # still in .Last.value
gc() # nothing

UseMethod essentially does a return call so you see the effect there.

The R_ReturnedValue register could probably be safely cleared in more
places but it isn't clear exactly where. As things stand it will be
cleared on the next use of a non-local transfer of control, and those
happen frequently enough that I'm not convinced this is worth
addressing, at least not at this point in the release cycle.
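
The clearing described above can be probed with a small sketch. The `state` environment and its `finalized` flag are illustrative additions, not part of the original example; the flag simply records that the finalizer ran, so it can be queried instead of watching for a message. No particular output is guaranteed: going by the explanation above, affected R versions would be expected to report FALSE and then TRUE, while other versions may report TRUE both times.

```r
# Illustrative flag (an assumption of this sketch): set by the finalizer.
state <- new.env()
state$finalized <- FALSE

new_foo <- function() {
  e <- new.env()
  reg.finalizer(e, function(e) state$finalized <- TRUE)
  e
}

# Explicit return: a non-local transfer of control.
bar <- function(x) return(x)

invisible(bar(new_foo()))
invisible(gc())  # at an interactive prompt the object is still in .Last.value
invisible(gc())
after_first <- state$finalized

# A second explicit return gives the register a new value to hold.
invisible(bar(1))
invisible(gc())
after_second <- state$finalized

cat("finalized after two gc() calls:  ", after_first, "\n")
cat("finalized after bar(1) and gc():", after_second, "\n")
```

The only relation that holds on every version is that the flag never goes back to FALSE once set.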

Best,

luke


--
Luke Tierney
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:             319-335-3386
Department of Statistics and        Fax:               319-335-3017
    Actuarial Science
241 Schaeffer Hall                  email:   [hidden email]
Iowa City, IA 52242                 WWW:  http://www.stat.uiowa.edu

Re: Objects not gc'ed due to caching (?) in R's S3 dispatch mechanism

Iñaki Úcar
2018-03-27 6:02 GMT+02:00  <[hidden email]>:

> This has nothing to do with printing or dispatch per se. It is the
> result of an internal register (R_ReturnedValue) being protected. It
> gets rewritten whenever there is a jump, e.g. by an explicit return
> call. So a simplified example is
>
> new_foo <- function() {
>   e <- new.env()
>   reg.finalizer(e, function(e) message("Finalizer called"))
>   e
> }
>
> bar <- function(x) return(x)
>
> bar(new_foo())
> gc() # still in .Last.value
> gc() # nothing
>
> UseMethod essentially does a return call so you see the effect there.

Understood. Thanks for the explanation, Luke.

> The R_ReturnedValue register could probably be safely cleared in more
> places but it isn't clear exactly where. As things stand it will be
> cleared on the next use of a non-local transfer of control, and those
> happen frequently enough that I'm not convinced this is worth
> addressing, at least not at this point in the release cycle.

I barely know the R internals, and I'm sure there's a good reason
behind this change (R 3.2.3 does not show this behaviour), but IMHO
it's, at the very least, confusing. When .Last.value is cleared, that
object loses the last reference, and I'd expect it to be eligible for
gc.

In my case, I was using an object that internally generates a bunch of
data. I discovered this because I was benchmarking the execution and
running out of memory: the memory wasn't being freed as it was supposed
to be. So I spent half a day on this because I thought I had a memory
leak. :-\ (Not blaming anyone here, of course; just making a case to
show that this may be worth addressing at some point.) :-)

Regards,
Iñaki


Re: Objects not gc'ed due to caching (?) in R's S3 dispatch mechanism

Tomas Kalibera
On 03/27/2018 09:51 AM, Iñaki Úcar wrote:

> I barely know the R internals, and I'm sure there's a good reason
> behind this change (R 3.2.3 does not show this behaviour), but IMHO
> it's, at the very least, confusing. When .Last.value is cleared, that
> object loses the last reference, and I'd expect it to be eligible for
> gc.
>
> In my case, I was using an object that internally generates a bunch of
> data. I discovered this because I was benchmarking the execution and
> running out of memory: the memory wasn't being freed as it was supposed
> to be. So I spent half a day on this because I thought I had a memory
> leak. :-\ (Not blaming anyone here, of course; just making a case to
> show that this may be worth addressing at some point.) :-)
From the perspective of the R user/programmer/package developer, please
do not make any assumptions about when finalizers will be run, only that
they won't be run while the object is still alive. Similarly, it is not
good to assume that "gc()" will actually run a collection (or a
particular type of collection, or that it will run immediately, etc.).
Such guarantees would restrict the design space and potential
optimizations on the R internals side too much, and for this reason they
are typically not given in other managed languages either. I've seen R
examples where most of the time was wasted tracing live objects because
an explicit "gc()" was run in a tight loop. Note that in Java, for
instance, an explicit call to gc() was eventually turned into a hint
only.
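
One way to follow this advice is to release expensive state explicitly and keep the finalizer only as a safety net. This is a minimal sketch under stated assumptions: `new_resource()`, `release()` and `with_resource()` are hypothetical helpers invented for illustration. They are not part of R or of any package mentioned in this thread.

```r
# Hypothetical resource: an environment holding a (potentially large) buffer.
new_resource <- function(n = 1e6) {
  res <- new.env()
  res$data <- numeric(n)
  res$released <- FALSE
  # Safety net only: may run late, or at session exit (onexit = TRUE).
  reg.finalizer(res, function(e) release(e), onexit = TRUE)
  res
}

# Idempotent cleanup that does not depend on the garbage collector.
release <- function(res) {
  if (!res$released) {
    res$data <- NULL       # drop the buffer held by the environment
    res$released <- TRUE
  }
  invisible(res)
}

# Run f on a fresh resource and release it when f finishes or fails.
with_resource <- function(f, ...) {
  res <- new_resource(...)
  on.exit(release(res), add = TRUE)
  f(res)
}

len <- with_resource(function(r) length(r$data), n = 10)
```

With this pattern the buffer is dropped when `with_resource()` exits, regardless of when (or whether) the collector later finalizes the environment itself.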

Once you start debugging when objects are collected, you are debugging R
internals - and surprises/changes between svn versions/etc should be
expected as well as changes in behavior caused very indirectly by code
changes somewhere else. I work on R internals and spend most of my time
debugging - that is unfortunately normal when you work on a language
runtime. Indeed, the runtime should try not to keep references to
objects for too long, but it remains to be seen whether, and at what
cost, this could be fixed with R_ReturnedValue.

Best
Tomas


Re: Objects not gc'ed due to caching (?) in R's S3 dispatch mechanism

Iñaki Úcar
2018-03-27 11:11 GMT+02:00 Tomas Kalibera <[hidden email]>:

> Once you start debugging when objects are collected, you are debugging R
> internals - and surprises/changes between svn versions/etc should be
> expected as well as changes in behavior caused very indirectly by code
> changes somewhere else. I work on R internals and spend most of my time
> debugging - that is unfortunately normal when you work on a language
> runtime. Indeed, the runtime should try not to keep references to objects
> for too long, but it remains to be seen whether and for what cost this could
> be fixed with R_ReturnedValue.

To be precise, I was not debugging *when* objects were collected, I
was debugging *whether* objects were collected. And for that, I
necessarily need some hint about the *when*.

But I think that's another discussion. My point is that, as an R user
and package developer, I expect consistency, and currently

new_foo <- function() {
  e <- new.env()
  reg.finalizer(e, function(e) message("Finalizer called"))
  e
}

bar <- function(x) return(x)

bar(new_foo())
gc() # still in .Last.value
gc() # nothing

behaves differently than

new_foo <- function() {
  e <- new.env()
  reg.finalizer(e, function(e) message("Finalizer called"))
  e
}

bar <- function(x) x

bar(new_foo())
gc() # still in .Last.value
gc() # Finalizer called!

And such a difference is not explained (AFAIK) in the documentation.
At least, nothing in the help page for 'return' suggests that the
behaviour should differ depending on whether or not I write an
explicit 'return'.
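
The two variants can be compared side by side with a small harness. This is a sketch only: `probe()` and the `state` flag are hypothetical names introduced here, and the result depends on the R version, so no particular output is claimed.

```r
# Hypothetical helper: pass a fresh finalizable environment through f,
# collect twice, and report whether the finalizer has run by then.
probe <- function(f) {
  state <- new.env()
  state$finalized <- FALSE
  make <- function() {
    e <- new.env()
    reg.finalizer(e, function(e) state$finalized <- TRUE)
    e
  }
  f(make())
  gc()
  gc()
  state$finalized
}

with_return    <- function(x) return(x)
without_return <- function(x) x

res_return    <- probe(with_return)
res_no_return <- probe(without_return)

cat("explicit return, finalized:", res_return, "\n")
cat("implicit return, finalized:", res_no_return, "\n")
```

Differing results would reproduce the inconsistency described above. Identical results suggest that the running R version does not show it.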

Regards,
Iñaki


Re: Objects not gc'ed due to caching (?) in R's S3 dispatch mechanism

Tomas Kalibera
On 03/27/2018 11:53 AM, Iñaki Úcar wrote:

> To be precise, I was not debugging *when* objects were collected, I
> was debugging *whether* objects were collected. And for that, I
> necessarily need some hint about the *when*.
They would be collected eventually if you were running a non-trivial
program (because there would be a jump inside).
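
A rough way to see this "eventually" is to create several objects in a loop and count the finalizers that have run. This is a sketch under assumptions: the `counter` environment is an illustrative addition, and the exact count is not guaranteed on any R version. Going by the earlier explanation, an affected version would be expected to retain at most the most recently returned object.

```r
# Illustrative counter (an assumption of this sketch): incremented by finalizers.
counter <- new.env()
counter$n <- 0L

new_foo <- function() {
  e <- new.env()
  reg.finalizer(e, function(e) counter$n <- counter$n + 1L)
  e
}

bar <- function(x) return(x)  # every call performs a jump

total <- 10L
for (i in seq_len(total)) bar(new_foo())

invisible(gc())
invisible(gc())

cat("created:", total, " finalized so far:", counter$n, "\n")
```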

> But I think that's another discussion. My point is that, as an R user
> and package developer, I expect consistency, and currently
>
> new_foo <- function() {
>    e <- new.env()
>    reg.finalizer(e, function(e) message("Finalizer called"))
>    e
> }
>
> bar <- function(x) return(x)
>
> bar(new_foo())
> gc() # still in .Last.value
> gc() # nothing
>
> behaves differently than
>
> new_foo <- function() {
>    e <- new.env()
>    reg.finalizer(e, function(e) message("Finalizer called"))
>    e
> }
>
> bar <- function(x) x
>
> bar(new_foo())
> gc() # still in .Last.value
> gc() # Finalizer called!
>
> And such a difference is not explained (AFAIK) in the documentation.
> At least the help page for 'return' does not make me think that I
> should not expect exactly the same behaviour if I write (or not) an
> explicit 'return'.
As an R user and package developer, you should have consistency in
_documented_ behavior. If not, it is a bug and has to be fixed either in
the documentation or in the code. You should never depend on
undocumented behavior, because that can change at any time. You cannot
expect different versions of R to behave exactly the same, not even the
svn versions. That is not possible, and would not be possible even if we
did not change any code in the R implementation, because even the OS, C
compiler, hardware, and third-party libraries have their specified and
unspecified behavior.

Best
Tomas

Re: Objects not gc'ed due to caching (?) in R's S3 dispatch mechanism

luke-tierney
I have committed a change to R-devel that addresses this. To be on the
safe side I need to run some more extensive tests before deciding if
this can be ported to the release branch for R 3.5.0. Should know in a
day or two.

Best,

luke


Re: Objects not gc'ed due to caching (?) in R's S3 dispatch mechanism

luke-tierney
Now also committed to the release branch.

Best,

luke
