S4 vs Reference Classes

S4 vs Reference Classes

Joseph Park

   Hi, I'm looking for some guidance on whether to use
   S4 or Reference Classes for an analysis application
   I'm developing.
   I'm a C++/Python developer, and like to 'think' in OOD.
   I started my app with S4, thinking that was the best
   set of OO features in R. However, it appears that one
   needs Reference Classes to allow object methods to assign
   values (other than the .Object in the initialize method)
   to slots of the object.
   This is typically what I prefer: creating an object, then
   operating on the object (reference) calling object methods
   to access/modify slots.
   So I'm wondering what (dis)advantages there are in
   developing with S4 vs Reference Classes.
   Things of interest:
   Performance (i.e. memory management)
   Integration compatibility with R packages
   ??? other issues
   Thanks!
______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: S4 vs Reference Classes

Steve Lianoglou-6
Hi,

I actually don't have much experience with Reference Classes and
have done (most) all of my R OO(P|D) with S4 (since I'm generally playing w/
bioconductor stuff, which has an S4 mandate).

I'm not sure exactly what you are after, but the way I design many of
my classes to enable them to have *some* pass by reference semantics
is to add a slot of type `environment` to the class def, like so:

setClass("Something",
  representation=representation(x='numeric', cache='environment'),
  prototype=prototype(x=numeric(), cache=new.env()))

Anything that gets put in `cache` is "passed by ref" so to speak. Consider this:

R> s1 <- new("Something", x=10)
R> s1@cache$by.reference <- 'there can be only 1'

R> s2 <- s1
R> s2@x
[1] 10

R> s2@x <- 12
R> s2@x
[1] 12

R> s1@x
[1] 10

R> s1@cache$by.reference
[1] "there can be only 1"

R> s2@cache$by.reference <- 'and then there were 2'
R> s2@cache$by.reference
[1] "and then there were 2"

R> s1@cache$by.reference
[1] "and then there were 2"


Proceed with caution ...

HTH,

-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact


Re: S4 vs Reference Classes

Douglas Bates-2
In reply to this post by Joseph Park
From a C++/Python background you will probably feel more comfortable
with reference classes.  They are newer than S4 classes and much newer
than S3 "classes" (which aren't really classes) and methods.  Because
reference classes are newer the support for them has not been as fully
developed and you may encounter warts from time to time.

I use both reference classes and S4 classes.  Often I have objects
that represent model/data combinations for which the parameter
estimates are to be determined by optimizing a criterion.  In those
cases it makes sense to me to use reference classes because the state
of the object can be changed by a method.  I want to update the
parameters in the object and evaluate the estimation criterion without
needing to copy the entire object.  If you try to perform some kind of
update operation on an S4 object and not cheat in some way (i.e.
adhere to strict functional programming semantics) you need to create
a new instance of the object each time you update it.  When the object
is potentially very large you find yourself worrying about memory
usage if you take that route.  I found that my code started to look
pretty ugly because conceptually I was updating in place but the code
needs to be written as replacements.
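
To make the in-place update concrete, here is a minimal sketch of a reference class whose method changes the object's state without the caller reassigning anything. The class name `Model`, the fields `y` and `theta`, and the method names are hypothetical and chosen only for illustration; they are not taken from any package.

```r
library(methods)

# Hypothetical reference class: all names here are illustrative assumptions.
Model <- setRefClass("Model",
  fields = list(y = "numeric", theta = "numeric"),
  methods = list(
    # Sum-of-squares criterion evaluated at the current parameter value.
    criterion = function() {
      sum((y - theta)^2)
    },
    # Update the parameter in place; `<<-` assigns to the field itself.
    setTheta = function(value) {
      theta <<- value
      invisible(.self)
    }
  )
)

m <- Model$new(y = c(1, 2, 3), theta = 0)
m$criterion()   # criterion at theta = 0
m$setTheta(2)   # the object m itself is modified; no reassignment needed
m$criterion()   # criterion at theta = 2
```

An optimizer could call `m$setTheta()` and `m$criterion()` repeatedly while the data held in `y` stays in the one object.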

Having said all that, you should realize that the style of programming
favored in R, and particularly in R packages, is to regard a method as
determined jointly by the generic function and the class(es) of the
argument(s).  This is different from most other object-oriented
languages in which the class is paramount and a method is just a
member of a class that happens to be code, not data.  You can get a
lot of mileage out of the idiom of defining methods for common
generics (print, plot, summary, ...) for particular S3 or S4 classes.
The structure of R packages favors S3 generics but you can define a
method for an S3 generic applied to an object from an S4 class.  The
only restriction is that S3 generics can only dispatch on the first
argument but that is what happens in a language where the methods are
part of the class definitions.  When you need multiple dispatch S4
generics and methods are worth the pain.
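
As a rough sketch of both idioms, the following defines a method for the existing generic `summary` on an S4 class, and then an S4 generic that dispatches on two arguments. The class `Interval` and the generic `overlaps` are hypothetical names made up for this example, and `setMethod` on `summary` is only one of the ways to attach a method to an S3 generic.

```r
library(methods)

# Hypothetical S4 class used only for illustration.
setClass("Interval", representation(lo = "numeric", hi = "numeric"))

# A method for the common generic `summary`, selected by the class of
# the first argument.
setMethod("summary", "Interval", function(object, ...) {
  c(lo = object@lo, hi = object@hi, width = object@hi - object@lo)
})

# An S4 generic with multiple dispatch: the method is chosen from the
# classes of both `a` and `b`.
setGeneric("overlaps", function(a, b) standardGeneric("overlaps"))

setMethod("overlaps", signature("Interval", "Interval"), function(a, b) {
  a@lo <= b@hi && b@lo <= a@hi
})

setMethod("overlaps", signature("Interval", "numeric"), function(a, b) {
  all(b >= a@lo & b <= a@hi)
})

i <- new("Interval", lo = 0, hi = 2)
summary(i)
overlaps(i, new("Interval", lo = 1, hi = 3))  # Interval/Interval method
overlaps(i, 1.5)                              # Interval/numeric method
```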

So my current approach is to use S4 classes for objects that are in
some way static but to use reference classes for objects that will
need to be updated when performing some kind of estimation (or other
such operations such as Markov chain Monte Carlo).


Re: S4 vs Reference Classes

Douglas Bates-2
In reply to this post by Steve Lianoglou-6
On Tue, Sep 13, 2011 at 4:11 PM, Steve Lianoglou
<[hidden email]> wrote:

> I'm not sure exactly what you are after, but the way I design many of
> my classes to enable them to have *some* pass by reference semantics
> is to add a slot of type `environment` to the class def, like so:
>
> setClass("Something",
>  representation=representation(x='numeric', cache='environment'),
>  prototype=prototype(x=numeric(), cache=new.env()))
>
> Anything that gets put in `cache` is "passed by ref" so to speak. Consider this:
>
> R> s1 <- new("Something", x=10)
> R> s1@cache$by.reference <- 'there can be only 1'
>
> R> s2 <- s1
> R> s2@x
> [1] 10
>
> R> s2@x <- 12
> R> s2@x
> [1] 12
>
> R> s1@x
> [1] 10
>
> R> s1@cache$by.reference
> [1] "there can be only 1"
>
> R> s2@cache$by.reference <- 'and then there were 2'
> R> s2@cache$by.reference
> [1] "and then there were 2"
>
> R> s1@cache$by.reference
> [1] "and then there were 2"
>

That is essentially how reference classes are implemented (plus a lot
of syntactic sugar).


Re: S4 vs Reference Classes

Martin Morgan
In reply to this post by Joseph Park
On 09/13/2011 10:54 AM, Joseph Park wrote:

>
>     Hi, I'm looking for some guidance on whether to use
>     S4 or Reference Classes for an analysis application
>     I'm developing.
>     I'm a C++/Python developer, and like to 'think' in OOD.
>     I started my app with S4, thinking that was the best
>     set of OO features in R. However, it appears that one
>     needs Reference Classes to allow object methods to assign
>     values (other than the .Object in the initialize method)
>     to slots of the object.

With

   setClass("A", representation=representation(slt="numeric"))

a slot can be updated with @<- and an object updated with a replacement
method

   setGeneric("slt<-", function(x, ..., value) standardGeneric("slt<-"))

   setReplaceMethod("slt", c("A", "numeric"), function(x, ..., value) {
       x@slt <- value
       x
   })

so

 > a = new("A", slt=1)
 > slt(a) = 2
 > a
An object of class "A"
Slot "slt":
[1] 2

The default initialize method also works as a copy constructor with
validity check, e.g., allowing multiple slot updates

   setReplaceMethod("slt", c("A", "ANY"), function(x, ..., value) {
       initialize(x, slt=as.numeric(value))
   })

 > slt(a) = "1"


>     This is typically what I prefer: creating an object, then
>     operating on the object (reference) calling object methods
>     to access/modify slots.
>     So I'm wondering what (dis)advantages there are in
>     developing with S4 vs Reference Classes.

R's copy-on-change semantics leads me to expect that

b = a
slt(a) = 2

leaves b unchanged, which S4 does (necessarily copying and thus with a
time and memory performance cost). A reference class might be
appropriate when the entity referred to exists in a single copy, as
e.g., an on-disk data base, or an external pointer to a C++ class.
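
For contrast with the copy-on-change behaviour above, here is a small sketch of what plain assignment does with a reference class object. The class name `Counter` and its field `n` are hypothetical; `$copy()` is the method reference class objects provide for making an independent copy.

```r
library(methods)

# Hypothetical reference class with a single numeric field.
Counter <- setRefClass("Counter", fields = list(n = "numeric"))

a <- Counter$new(n = 1)
b <- a          # b refers to the same object as a; nothing is copied
d <- a$copy()   # d is an independent copy of a

a$n <- 2
b$n             # changed along with a
d$n             # still holds the value from before the update
```

This is the behaviour a user coming from ordinary R objects may not expect, which is why a single underlying entity is a natural fit for a reference class.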

Martin



--
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109

Location: M1-B861
Telephone: 206 667-2793


Re: S4 vs Reference Classes

Joseph Park

   Thanks Martin.

   What I'm hoping to do is have a class object with a member method
   that can change the values of slots in the object, without having to
   assign values to the object externally. Something like this:

   setClass ( "Element",
     representation ( x  = "numeric", y  = "numeric" ),
     prototype = list( x = 0, y = 1 )
   )

   setGeneric( name = "ComputeX",
               def  = function( self ) standardGeneric("ComputeX") )

   setMethod( "ComputeX", signature = "Element",
     function ( self ) {
       if ( self @ y > 0 ) {
         self @ x = pi
       }
     }
   )

   so that a call to the method ComputeX assigns ('internally') a
   value to the slot x of the global object.

   One can do:

   a = new( 'Element' )
   a @ x = 2

   but I would prefer to have a class method do the work without
   having to explicitly call a @ x = 2. Having to do this means that
   I need code in my main processing app that does things to slots
   that I would normally do in a class method.

   As I understand it, Reference Classes provide this. So I'm
   naturally wondering if I should switch my app from S4 to RC.

   Fundamentally, I don't clearly understand S4 and what the difference
   is between creating a setReplaceMethod vs a setMethod, since it
   seems that in either case one has to 'externally' assign the slot
   value. My limitation, of course.

Re: S4 vs Reference Classes

Martin Morgan
On 09/14/2011 06:01 AM, Joseph Park wrote:

> Thanks Martin.
>
> What i'm hoping to do is have a class object, with a member method
> that can change values of slots in the object, without having to
> assign values by external assignment to the object. Something like this:
>
> setClass ( "Element",
> representation ( x = "numeric", y = "numeric" ),
>
> prototype = list( x = 0, y = 1 )
> )
>
> setGeneric( name = "ComputeX",
> def = function( self ) standardGeneric("ComputeX") )
>
> setMethod( "ComputeX", signature = "Element",
> function ( self ) {
> if ( self @ y > 0 ) {
> self @ x = pi
> }
> }
> )
>
> so that a call to the method ComputeX assigns ('internally') a
> value to the slot x of the global object.

Hi Joseph --

I understand. In R generally and in S4 in particular self@x = pi
triggers a 'copy-on-change', so self inside the function is now
different from self outside the function.

You either need to change your expectations, or use reference classes
(and change the expectations of your users).

For completeness, in your function above you would return self, and have

elt = ComputeX(elt)

you'd also likely implement some 'accessor' X (or better named) so

X(elt)

to get X. So there is no direct call to @ in your code.
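
Put together, the functional version might look like the sketch below. It reuses the `Element` class and `ComputeX` generic from the quoted post; the accessor name `X` follows the suggestion above, and the rest is an assumption about how one would fill in the details.

```r
library(methods)

setClass("Element",
  representation(x = "numeric", y = "numeric"),
  prototype = list(x = 0, y = 1))

setGeneric("ComputeX", function(self) standardGeneric("ComputeX"))

setMethod("ComputeX", "Element", function(self) {
  if (self@y > 0) {
    self@x <- pi   # changes only the copy local to this function
  }
  self             # hand the modified copy back to the caller
})

# Accessor, so calling code never touches @ directly.
setGeneric("X", function(self) standardGeneric("X"))
setMethod("X", "Element", function(self) self@x)

elt <- new("Element")
elt <- ComputeX(elt)   # reassign to keep the updated object
X(elt)
```

Without the reassignment on the second-to-last line, `elt` would keep its original value of `x`.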

It might help to understand a real use case; if it's just 'that's the
way other programming languages do it' then there isn't much more to
discuss. But maybe, like Doug Bates, you have a particular problem with
the paradigm?

> One can do :
> a = new( 'Element' )
> a @ x = 2
>
> but i would prefer to have a class method do the work without
> having to explicitly call a @ x = 2. Having to do this means that
> i need code in my main processing app that does things on slots
> that normally i would do in a class method.
>
> As I understand it, Reference Classes provide this. So i'm
> naturally wondering if i should switch my app from S4 to RC.
>
> Fundamentally, I don't clearly understand S4 and what the difference
> is between creating a SetReplaceMethod vs a SetMethod, since it
> seems that in either case one has to 'externally' assign the slot
> value. My limitation, of course.

at some level they are differences in syntax only, e.g.,

slt(a) = 2

versus

setGeneric("updt", function(x, value, ...) standardGeneric("updt"))
setMethod("updt", c("A", "numeric"), function(x, value, ...) {
     # the slot of class "A" is named slt
     initialize(x, slt=value)
})

and then

a = updt(a, 3)

The 'updt' model easily extends to multiple arguments; both represent an
abstraction between the API seen by the user, and the implementation of
the class, so there's no reason to store '3' directly.
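
As an illustration of the multiple-argument case, the sketch below passes several slot values through `initialize()` in one call, so the validity check runs once on the combined result. The class `Range`, its slots, and its validity rule are hypothetical and exist only for this example.

```r
library(methods)

# Hypothetical class with a validity rule that relates two slots.
setClass("Range",
  representation(lo = "numeric", hi = "numeric"),
  prototype = list(lo = 0, hi = 1),
  validity = function(object) {
    if (object@lo > object@hi) "lo must not exceed hi" else TRUE
  })

setGeneric("updt", function(x, ...) standardGeneric("updt"))

# initialize() acts as a copy constructor: it sets the named slots on a
# copy of x and then checks validity.
setMethod("updt", "Range", function(x, ...) {
  initialize(x, ...)
})

r <- new("Range")
r <- updt(r, lo = 2, hi = 5)   # both slots updated together

# An update that would break the validity rule signals an error.
bad <- tryCatch(updt(r, lo = 10), error = function(e) "rejected")
```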

Martin


--
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109

Location: M1-B861
Telephone: 206 667-2793


Re: S4 vs Reference Classes

Joseph Park

   Gentlemen: Steve, Martin & Doug:

   Thanks for the insightful comments regarding my query.
   I think that Martin and Doug have assessed my position well;
   both offer useful advice and have greatly improved
   my limited understanding of S4. Thanks!

   At this point I'm well into the app via S4, and so will
   probably continue on. If the app finds wings, then I'll
   convert it to Reference Classes.

   Generally, my problem with S4 in an OO paradigm is that I
   need to add (what I consider) extra code in the main app
   environment to update object slots. As Doug points out:

   "If you try to perform some kind of
   update operation on an S4 object and not cheat in some way (i.e.
   adhere to strict functional programming semantics) you need to create
   a new instance of the object each time you update it."

   which is my issue. Without the reference-based approach, an object
   in a slot that is then included in another object's slot is a copy.
   An update to the original object slot then requires 'extra' code
   to update/synchronize the copy.

   This is not a complaint! I find R quite amazing and powerful.
   Next time I'll dive into the Reference Class methods, or perhaps,
   as suggested, hybridize the current app.

Re: S4 vs Reference Classes

Steve Lianoglou-6
Hi,

Just wanted to say that embedding a slot in your class that's an
environment (as I showed earlier) will still solve your problem w/o you
having to switch to Ref classes (since you've already done lots of
work for your app in S4).

Let's assume you have a slot `cache` that is an environment; using
your latest example, let's say it's like this:

setClass("Element",
 representation=representation(x='numeric', cache='environment'),
 prototype=prototype(x=numeric(), cache=new.env()))

Let's say "gradient" is something you want to be accessed by reference;
you can have something like this (setGeneric calls left out for lack of
time):

setMethod("gradient", "Element", function(x, ...) {
  if (!'gradient' %in% ls(x@cache)) {
    x@cache$gradient <- calc.gradient.from.element(x)
  }
  x@cache$gradient
})

Then a call to `gradient(my.obj)` will return the gradient if it is
already calculated, or it will calc it on the fly and set it into your
object (w/o copying your object) and return it when it's done.
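
For anyone who wants to run this, here is a self-contained sketch with the generic filled in. `calc.gradient.from.element` is only a placeholder here (it just takes first differences of `x`), since the real calculation is not given in this thread; the object is also handed its own environment explicitly when it is created.

```r
library(methods)

setClass("Element",
  representation(x = "numeric", cache = "environment"))

# Placeholder for the real computation, which this thread does not define.
calc.gradient.from.element <- function(x) diff(x@x)

setGeneric("gradient", function(x, ...) standardGeneric("gradient"))

setMethod("gradient", "Element", function(x, ...) {
  # inherits = FALSE restricts the lookup to the cache environment itself.
  if (!exists("gradient", envir = x@cache, inherits = FALSE)) {
    assign("gradient", calc.gradient.from.element(x), envir = x@cache)
  }
  get("gradient", envir = x@cache, inherits = FALSE)
})

# Give the object its own environment explicitly.
e <- new("Element", x = c(1, 4, 9), cache = new.env())
gradient(e)   # computed on the first call and stored in e@cache
gradient(e)   # the second call reads the stored value
```

Note that `e` is never reassigned, yet the cached value persists between calls because the environment in the slot is shared by reference.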

> which is my issue. Without the reference-based approach an object
> in a slot which is then included in another object slot is a copy.
> An update to the original object slot then requires 'extra' code
> to update/synchronize the copy.

Again, this "semi-s4-semi-ref-class" approach would get around this
issue ... but life might get confusing for you (or your users) depending
on what one expects as "normal" behavioR.

Just wanted to try to clear up my original intention (if it wasn't
clear before).

-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact


Re: S4 vs Reference Classes

Steve Lianoglou-6
Hi Joseph (and Martin),

Don't mean to beat a dead horse, but I wanted to add one last comment
to this thread in case someone (or you) stumbles upon this via
Google/Gmane and gives it a shot.

I neglected to mention a very important step that you'd have to do
in order to avoid shooting yourself in the foot.

Martin, off list, thankfully pointed out to me that you still need to
define an "initialize" method for your class so that each @cache slot
for every new object defined gets *its own* environment. If you don't,
they all share the same environment when you create new objects
through a call to `new("Element")`.

Here's what happens and how to fix it ... it's intentionally a bit
verbose for pedagogical purposes, so please bear with me:

R> setClass("Element",
 representation=representation(x='numeric', cache='environment'),
 prototype=prototype(x=numeric(), cache=new.env()))

R> a <- new("Element")
R> b <- new("Element")

If we look at the cache object in both `a` and `b`, you'll see that
they actually are *the same* environment:

R> a@cache
<environment: 0x100a23788>

R> b@cache
<environment: 0x100a23788>

See -- those two environments share the same address. So, if you do:

R> a@cache$some.var <- 42
R> a@cache$some.var
[1] 42

R> b@cache$some.var
[1] 42

¡Yikes!

If you explicitly set the cache slot to a `new.env()` you can avoid this:

R> a <- new("Element", cache=new.env())
R> b <- new("Element", cache=new.env())
R> a@cache
<environment: 0x10214d5b8>
R> b@cache
<environment: 0x100eff908>

You see the two environments are different, so setting a var into one
@cache won't affect the other:

R> a@cache$some.var <- 42
R> b@cache$some.var
NULL

So that's what you want, but who wants to keep typing new("Element",
cache=new.env())? Not me, and that's what initialize methods are for.
Here's what the ones I have in my libs look like:

setMethod("initialize", "Element",
  function(.Object, ..., x=numeric(), cache=new.env()) {
    callNextMethod(.Object, x=x, cache=cache, ...)
})

Now, with those loaded up:

R> aa <- new("Element")
R> bb <- new("Element")
R> aa@cache
<environment: 0x10312e3f8>

R> bb@cache
<environment: 0x103251ae0>

Problem solved.
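
If you'd rather not eyeball addresses, the same before/after comparison can be done with `identical()`. This is just a sketch reusing the definitions above; the `shared.before` and `shared.after` names are mine:

```r
library(methods)

setClass("Element",
  representation=representation(x='numeric', cache='environment'),
  prototype=prototype(x=numeric(), cache=new.env()))

# Without an initialize method, every object gets the prototype's environment
shared.before <- identical(new("Element")@cache, new("Element")@cache)

setMethod("initialize", "Element",
  function(.Object, ..., x=numeric(), cache=new.env()) {
    callNextMethod(.Object, x=x, cache=cache, ...)
})

# With the initialize method, new.env() is evaluated once per call to new()
aa <- new("Element")
bb <- new("Element")
aa@cache$some.var <- 42
shared.after <- identical(aa@cache, bb@cache)
```

Here `shared.before` comes out TRUE and `shared.after` FALSE, and `bb@cache$some.var` stays NULL.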

Martin suggested a slightly different version of "initialize", like so:

setMethod("initialize", "Element", function(.Object, ...) {
   callNextMethod(.Object, ..., cache=new.env(parent=emptyenv()))
})

Where he mentions "... with parent=emptyenv() to avoid searching
outside the cache during symbol look-up".

I actually never used that, and don't think I ran into problems (I
always set `inherits=FALSE` if I'm `get`-ing something out of an
environment), but I'd go with his advice over mine any day.
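
Here's a quick sketch of what the parent environment changes, as I understand it. The variable names are made up, and `sum` is just a convenient symbol that lives in base and is never stored in either cache:

```r
e.default <- new.env()                    # parent is the calling environment
e.empty   <- new.env(parent=emptyenv())   # no enclosing environment to search

# By default, exists() walks up through the enclosing environments,
# so it finds base::sum even though nothing was stored in e.default.
found.default <- exists("sum", envir=e.default)

# With an empty parent there is nowhere else to look.
found.empty <- exists("sum", envir=e.empty)

# inherits=FALSE restricts the lookup to the environment itself.
found.strict <- exists("sum", envir=e.default, inherits=FALSE)
```

Only `found.default` is TRUE. Access with `$`, as in `x@cache$gradient`, never looks in parent environments either way, so the choice of parent mainly matters for `get()` and `exists()` calls that leave `inherits` at its default.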

So ...

(i) thanks to Martin for pointing that out; and
(ii) thanks for bearing with me here,

I'll stop now :-)

-steve

On Wed, Sep 14, 2011 at 4:24 PM, Joseph Park <[hidden email]> wrote:

> Thanks Steve.
>
> I'll take a closer look at this.
>
> all the best...



--
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
