Weird behavior with S4 subclasses of data.table after loading RCurl

Weird behavior with S4 subclasses of data.table after loading RCurl

Jeffrey Arnold
This is a repost from http://stackoverflow.com/questions/12655600/weird-behavior-with-data-table-devtools-and-s4, but now I'm using version 1.8.2 of data.table. 

I am getting some really weird behavior when trying to write an S4 subclass of data.table. In short, after loading the RCurl package, the "[" method can no longer find variable names in the scope of the data table. I originally found this out while developing a package with the devtools package, and traced the problem to importing RCurl. RCurl and data.table are up to date.

At this point, I have no idea what's going on, but I think the following code is the best minimal example that reproduces this behavior.

> library("data.table")
data.table 1.8.2  For help type: help("data.table")
> sessionInfo()
R version 2.15.1 (2012-06-22)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=C                 LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] data.table_1.8.2
> setClass("DataTable2", contains="data.table")
> DT1 <- data.table(x=rep(c("a","b","c"),each=3), y=c(1,3,6), v=1:9)
> DT2 <- new("DataTable2", data.table(x=rep(c("a","b","c"),each=3), y=c(1,3,6), v=1:9))
> ## Everything works!
> tables()
     NAME NROW MB COLS  KEY
[1,] DT1     9 1  x,y,v
[2,] DT2     9 1  x,y,v
Total: 2MB
> is(DT2, "data.table")
[1] TRUE
> DT1[2]
   x y v
1: a 3 2
> DT2[2]
   x y v
1: a 3 2
> (bracketMethods1 <- methods("["))
 [1] [.acf*            [.AsIs            [.bibentry*       [.data.frame
 [5] [.data.table*     [.Date            [.difftime        [.factor
 [9] [.formula*        [.getAnywhere*    [.hexmode         [.ITime*
[13] [.listof          [.noquote         [.numeric_version [.octmode
[17] [.pdf_doc*        [.person*         [.POSIXct         [.POSIXlt
[21] [.raster*         [.roman*          [.simple.list     [.terms*
[25] [.ts*             [.tskernel*

   Non-visible functions are asterisked
> DT1[,v]
[1] 1 2 3 4 5 6 7 8 9
> DT2[,v]
[1] 1 2 3 4 5 6 7 8 9
> ## These are the packages loaded/imported by RCurl (and [.data.table works after them).
> ## library("tools")
> ## DT2[,v]
> ## library("bitops")
> ## DT2[,v]
> library("RCurl")
Loading required package: bitops
> sessionInfo()
R version 2.15.1 (2012-06-22)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=C                 LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] RCurl_1.95-0     bitops_1.0-4.1   data.table_1.8.2

loaded via a namespace (and not attached):
[1] tools_2.15.1
> ## This still works
> DT1[,v]
[1] 1 2 3 4 5 6 7 8 9
> ## This no longer works
> DT2[,v]
Error: object 'v' not found
> ## No changes in the extract S3 methods
> (bracketMethods2 <- methods("["))
 [1] [.acf*            [.AsIs            [.bibentry*       [.data.frame
 [5] [.data.table*     [.Date            [.difftime        [.factor
 [9] [.formula*        [.getAnywhere*    [.hexmode         [.ITime*
[13] [.listof          [.noquote         [.numeric_version [.octmode
[17] [.pdf_doc*        [.person*         [.POSIXct         [.POSIXlt
[21] [.raster*         [.roman*          [.simple.list     [.terms*
[25] [.ts*             [.tskernel*

   Non-visible functions are asterisked
> setdiff(bracketMethods1, bracketMethods2)
character(0)
> setdiff(bracketMethods2, bracketMethods1)
character(0)

Jeff

---
Jeffrey Arnold
Department of Political Science
University of Rochester


_______________________________________________
datatable-help mailing list
[hidden email]
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help

Re: Weird behavior with S4 subclasses of data.table after loading RCurl

Matthew Dowle

Hi,

Quick answer to start. Assuming your package Imports or Depends on
data.table (in the DESCRIPTION file) then see here for a description of
`cedta` and how it works; maybe it needs a fix:

http://stackoverflow.com/a/10529888/403310

Matthew



Re: Weird behavior with S4 subclasses of data.table after loading RCurl

Steve Lianoglou-6
Hi,

On Sat, Sep 29, 2012 at 6:21 PM, Matthew Dowle <[hidden email]> wrote:
>
> Hi,
>
> Quick answer to start. Assuming your package Imports or Depends on
> data.table (in the DESCRIPTION file) then see here for a description of
> `cedta` and how it works; maybe it needs a fix:
>
> http://stackoverflow.com/a/10529888/403310

Quick note: this doesn't look like it has to do w/ cedta ... I was
debugging this w/ the sample provided and cedta() returns TRUE.

Somehow, around line 780 of data.table.R, `xvars` is empty, and I guess
the column is not injected into the SDenv before the
`jval = eval(jsub,SDenv)` call, so it's not found.

-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact

Re: Weird behavior with S4 subclasses of data.table after loading RCurl

Jeffrey Arnold
While I came across this while working on a package that used S4, the code in the OP produces those errors when run in the global environment. In the output below, I call data.table:::cedta() before and after loading RCurl, and it returns TRUE in both cases. The data.table object DT1 works fine before and after loading RCurl. DT2, an object of the S4 class inheriting from data.table, acts like a data.table and not like a data.frame in every respect I've looked at, except when I use an unquoted variable name in j. That works fine before loading RCurl, but fails after it. I have no idea what RCurl could be changing that would affect this, which is why I am stumped.

> library("data.table")
data.table 1.8.2  For help type: help("data.table")
> sessionInfo()
R version 2.15.1 (2012-06-22)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=C                 LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] data.table_1.8.2
> ## cedta is TRUE
> data.table:::cedta()
[1] TRUE
> setClass("DataTable2", contains="data.table")
> DT1 <- data.table(x=rep(c("a","b","c"),each=3), y=c(1,3,6), v=1:9)
> DT2 <- new("DataTable2", data.table(x=rep(c("a","b","c"),each=3), y=c(1,3,6), v=1:9))
> library("RCurl")
Loading required package: bitops
> sessionInfo()
R version 2.15.1 (2012-06-22)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=C                 LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] RCurl_1.95-0     bitops_1.0-4.1   data.table_1.8.2
> ## cedta is still TRUE
> data.table:::cedta()
[1] TRUE
> ## This still works
> DT1[,v]
[1] 1 2 3 4 5 6 7 8 9
> ## This no longer works
> DT2[,v]
Error: object 'v' not found
> ## DT2 still behaving like data.table and not data.frame in other respects
> DT2[ , 4]
[1] 4
> DT2[ , "v"]  # returns v instead of the column
[1] "v"
> DT2[ , "v", with=FALSE] # returns the column
   v
1: 1
2: 2
3: 3
4: 4
5: 5
6: 6
7: 7
8: 8
9: 9



Re: Weird behavior with S4 subclasses of data.table after loading RCurl

Jeffrey Arnold
And what's even more unusual is that for the line DT2[ , v], the function `[.data.table` is never called. I ran debug(data.table:::`[.data.table`) and nothing happened for DT2[ , v], but the debugger was activated for DT2[ , 2] and the other extract calls on DT2. My hunch is that, since the S4 generic dispatches on a signature that includes the classes of i and j, something is going wrong there, because things only go awry when j is a name. I'm going to test out that hunch.


Re: Weird behavior with S4 subclasses of data.table after loading RCurl

Steve Lianoglou-6
Howdy,

The even stranger thing is that DT2[,v] never actually works for me ... with or without RCurl having been loaded.

I'm not sure your dispatching hunch is going to be it, though, because in my case [.data.table is always being called, but by all means please check it out.

My initial hunch would have been that you would have to `setMethod("[", "DataTable2", ...)`, but I guess not.

I'm not sure I've come across similar attempts, though, where an S4 class inherits from an S3 class. Have you? It might be helpful to see how that was wired up ...  I'm semi-surprised that this works at all :-) I think I only ever expected people to use a data.table object as a slot of another S4 object, not as its parent.

Interesting, though. Let's see ...

-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact


Re: Weird behavior with S4 subclasses of data.table after loading RCurl

Jeffrey Arnold
(Sorry, Steve; I realized that I originally replied to you instead of the list)

Okay, the good news is that I know what's going on; the bad news is that there exists no fix that doesn't break the existing syntax of data.table.

RCurl was a red herring.  I'm almost certain that loading any package that adds new S4 "[" methods will trigger this behavior, e.g. "Matrix".  The reason that my code never worked when you ran it is probably that you had already loaded some package like that before running my code.

What is happening is due to the difference in the way that S3 and S4 check signatures before method dispatch, combined with R's lazy evaluation.  Since S3 only checks the first argument, it never evaluates j, which allows data.table to do its cool things with expressions in that argument.  Because the S4 "[" method checks the classes of x, i, and j before dispatching, it must evaluate j.  If j is an expression, evaluating it either throws an error or, worse, silently does the unintended thing of evaluating the expression in the calling frame instead of within the data.table.
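
The effect can be reproduced without data.table. What follows is a minimal sketch under stated assumptions, not data.table's code: the generics pick and pick4 and the list d are made up for illustration, and the second setMethod() call merely stands in for loading a package that registers methods dispatching on j. As far as I understand the methods package, S4 dispatch only evaluates as many arguments as the longest signature registered for the generic, which would explain why things work until such a package is loaded.

```r
library(methods)

# Hypothetical S3 generic: dispatch looks at class(x) only, so `j` stays an
# unevaluated promise that the method can capture with substitute().
pick <- function(x, j) UseMethod("pick")
pick.default <- function(x, j) eval(substitute(j), x)

# Hypothetical S4 generic. Only `x` appears in a method signature so far.
setGeneric("pick4", function(x, j) standardGeneric("pick4"))
setMethod("pick4", signature(x = "list"),
          function(x, j) eval(substitute(j), x))

d <- list(v = 1:3)
s3_result <- pick(d, v)      # `v` is looked up inside d
before    <- pick4(d, v)     # still lazy: no method dispatches on `j` yet

# Stand-in for loading a package that registers methods dispatching on `j`.
setMethod("pick4", signature(x = "list", j = "character"),
          function(x, j) x[[j]])

# Dispatch now needs class(j), so `v` is evaluated in the calling frame.
after <- tryCatch(pick4(d, v), error = function(e) conditionMessage(e))
```

Here s3_result and before should both hold the column, while after should hold an error message saying that 'v' was not found, assuming no object named v exists in the calling environment.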

As far as I can tell, there is no way to fix this without altering the syntax of data.table.  

I do have the following workaround.  Adding the following S4 methods allows the use of quoted expressions in j for S4 classes inheriting from data.table that act like unquoted expressions for the S3 data.table.  

setMethod("[", c(x="data.table", i="ANY", j="ANY"),
          function(x, i, j, ...) callNextMethod(...))
setMethod("[", c(x="data.table", j="language"),
          function(x, i, j, ...) data.table(x)[j=eval(j), ...])

E.g. 

> library("Matrix")
> setClass("DataTable2", contains="data.table")
> setMethod("[", c(x="data.table", i="ANY", j="ANY"),
+           function(x, i, j, ...) callNextMethod(...))
[1] "["
> setMethod("[", c(x="data.table", j="language"),
+           function(x, i, j, ...) data.table(x)[j=eval(j), ...])
[1] "["
> ## This still doesn't work.
> DT2[,v]
Error: object 'v' not found
> ## This does work
> DT2[,quote(v)]
[1] 1 2 3 4 5 6 7 8 9
> DT2[,quote(sum(v))]
[1] 45
 
I hadn't realized that I was doing something unintended when I started, or maybe I wouldn't have :-)  R now supports S4 classes inheriting from S3 classes pretty well, so it seemed like a good idea at the time.  The S4 class I am actually writing is for storing and manipulating MCMC samples. One way to do that is to have a data.frame-like object with specific columns, e.g. "chain", "iteration", "parameter", ..., and then add functions that take advantage of this known structure.  I want to inherit from data.frame directly so that the class can make use of all the generic functions defined for data.frame.  It is more intuitive to use object[...] rather than object@someSlotName[...]. That all works great, except that these samples can get pretty big, so, of course, I want the performance of data.table :-) if I can have it.
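
As a rough illustration of that design, here is a sketch of an S4 class that extends data.frame and checks for the expected columns. The class name McmcSamples, the validity rule, and the value column are assumptions made for this example only; they are not taken from the actual package.

```r
library(methods)

# Hypothetical class: an S4 container for MCMC draws that *is* a data.frame.
setClass("McmcSamples", contains = "data.frame",
         validity = function(object) {
           needed <- c("chain", "iteration", "parameter")
           missing_cols <- setdiff(needed, names(object))
           if (length(missing_cols) == 0L) TRUE
           else paste("missing columns:", paste(missing_cols, collapse = ", "))
         })

# The unnamed argument supplies the data.frame part of the new object.
draws <- new("McmcSamples",
             data.frame(chain = 1L, iteration = 1:3, parameter = "mu",
                        value = c(0.1, 0.2, 0.3)))

# An object without the required columns is rejected by the validity check.
bad <- tryCatch(new("McmcSamples", data.frame(a = 1)),
                error = function(e) conditionMessage(e))
```

Because draws inherits from data.frame, functions written for data.frame can be applied to it directly, with no slot access needed.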


Re: Weird behavior with S4 subclasses of data.table after loading RCurl

Steve Lianoglou-6
Hi Jeffrey,

On Mon, Oct 1, 2012 at 1:39 PM, Jeffrey Arnold <[hidden email]> wrote:
> (Sorry, Steve; I realized that I originally replied to you instead of the
> list)
[snip]
> I hadn't realized that I was doing something unintended when I started, or
> maybe I wouldn't have :-)

Actually, it wasn't so unintended after all -- I had written a trivial
test (inst/tests/test-S4.R) to see that we could inherit (contains)
from data.table, but I never kicked the tires with "[" and stuff, so
...

>  Now R supports S4 classes inheriting from S3
> classes pretty well, so it seemed like a good idea at the time.   The S4
> class I am actually writing is for storing / manipulating MCMC samples. One
> way to do that is to have a data.frame like object with specific columns,
> e.g. "chain", "iteration", "parameter", ..., and then add functions that
> take advantage of this known structure.  I want to inherit from the
> data.frame directly so that it can make use of all the generic functions
> defined for the data.frame.  It is more intuitive to use object[...] rather
> than object@someSlotName[...]. That all works great, except that these
> samples can get pretty big, so, of course, I want the performance of
> data.table :-) if I can have it.

I agree that it would be handy to do what you want this way ... I am
unfortunately a bit short on time to help you dig into this at the
moment.

I think the "[" methods you are defining for data.table are on the
right track -- perhaps it will be a good idea to include these into
the data.table package and export them. Still, one package
simultaneously (and fully) supporting s3 and s4 might be a bit ...
something.

-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact

Re: Weird behavior with S4 subclasses of data.table after loading RCurl

Jeffrey Arnold
Hi Steve, 

For some reason I didn't think to look at the tests; I did browse AllS4.R to see what was being defined. At least from my perspective, there isn't anything you have to do. At this point, I know what the problem is and how to work around it.  There shouldn't be problems when data.table objects are used as slots of S4 classes, because R will know it is an S3 object and only look at x when dispatching [.

But it really appears to me that, as data.table is written, there is no way for S4 classes directly extending data.table to use [ with the exact same syntax. As far as I can tell (and I could be wrong), as long as data.table allows unquoted expressions in i and j, the S4 method [ will never be able to work. But there is too much code that relies on that syntax for data.table to change it, so there really isn't much that can be done.  The next best thing is adding S4 definitions of [ methods that at least allow the use of quoted expressions, and then warning people about it in the FAQ.

I think the following method definitions will catch any case with quoted expressions in i or j and make sure they get handled correctly. I can send a true patch and write some test cases if you want. 

setMethod("[", c(x="data.table", i="language", j="missing"),
          function(x, i, j, ...) data.table(x)[i=eval(i), ...])

setMethod("[", c(x="data.table", i="missing", j="language"),
          function(x, i, j, ...) data.table(x)[j=eval(j), ...])

setMethod("[", c(x="data.table", i="language", j="language"),
          function(x, i, j, ...) data.table(x)[i=eval(i), j=eval(j), ...])
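
The reason quoting gets through dispatch can be shown with a small stand-in. This is a sketch only: the generic pickq and the list d are hypothetical and data.table is not involved. A quoted expression is an ordinary object of class "name" or "call", and both extend the virtual class "language", so a single "language" method can catch them and evaluate them inside the data.

```r
library(methods)

# Hypothetical generic standing in for "[": `j` is part of the signature.
setGeneric("pickq", function(x, j) standardGeneric("pickq"))

# Dispatch evaluates `j`, but evaluating quote(...) just returns the
# expression itself, which the method then evaluates inside `x`.
setMethod("pickq", signature(x = "list", j = "language"),
          function(x, j) eval(j, x))

d <- list(v = 1:9)
col_v <- pickq(d, quote(v))        # j has class "name"
sum_v <- pickq(d, quote(sum(v)))   # j has class "call"
```

Under these assumptions col_v is the column itself and sum_v is its sum, mirroring the DT2[,quote(v)] and DT2[,quote(sum(v))] calls earlier in the thread.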

Jeff
