Retaining attributes of columns of a data frame when subsetting.


Rolf Turner

I am writing a function that involves a data frame "X" some columns of
which have attributes.  I replace X by a data frame with a subset of the
rows of X:

     X <- X[ok,]

where "ok" is a logical vector.  When I do this the attributes of the
columns (which I need to retain) are lost (except for the "class" and
"levels" attributes of columns which are factors).

Is there any sexy way to retain the attributes of the columns?
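
To make the problem concrete, here is a minimal reproduction. The toy data frame and the "units" attribute are made up for illustration; they are not part of the function being discussed.

```r
# Toy data frame; the "units" attribute is a made-up example
X <- data.frame(a = 1:5, b = c(2.5, 3.0, 1.0, 4.0, 6.5))
attr(X$a, "units") <- "cm"

ok <- c(TRUE, FALSE, TRUE, TRUE, FALSE)
Y <- X[ok, ]

attr(X$a, "units")  # still set on the original column
attr(Y$a, "units")  # NULL: row subsetting dropped it
```

Factor columns keep "class" and "levels" because factors have their own `[` method that restores them; a plain vector with an extra attribute has no such method.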

So far the only approach that I can work out is to extract the
attributes prior to subsetting and put them back after subsetting.

Like unto:

     SaveAt <- lapply(X,attributes)
     X <- X[ok,]
     lX <- lapply(names(X),function(nm,x,Sat){
                                attributes(x[[nm]]) <- Sat[[nm]]
                                x[[nm]]},x=X,Sat=SaveAt)
     names(lX) <- names(X)
     X <- as.data.frame(lX)

This seems to work, but is rather kludgy.  Is there a better way?

Thanks for any pointers.

cheers,

Rolf Turner

--
Honorary Research Fellow
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: Retaining attributes of columns of a data frame when subsetting.

Richard M. Heiberger

Look at

     methods(as.data.frame)

Define your specialized columns to have a newly defined class, say "myclass".
Then write as.data.frame.myclass
It will be similar to the function you already have in the lapply statement.
Now your statement

     X <- X[ok,]

should work.
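
A minimal sketch of the column-class idea follows. It rests on an assumption that this message does not state: the method that matters for row subsetting is `[` on the column's class, because `[.data.frame` subsets each column with `[`. The class name "keepattr" and the "units" attribute are hypothetical.

```r
# Hypothetical column class; "keepattr" is an illustrative name
`[.keepattr` <- function(x, ...) {
  out <- NextMethod()                       # ordinary vector subsetting
  saved <- attributes(x)
  # skip attributes whose length is tied to the length of the vector
  saved <- saved[setdiff(names(saved), c("names", "dim", "dimnames"))]
  for (k in names(saved)) attr(out, k) <- saved[[k]]
  out
}

X <- data.frame(a = 1:5, b = c(2.5, 3.0, 1.0, 4.0, 6.5))
attr(X$a, "units") <- "cm"
class(X$a) <- c("keepattr", class(X$a))     # tag the column, not the data frame

ok <- c(TRUE, FALSE, TRUE, TRUE, FALSE)
Y <- X[ok, ]
attr(Y$a, "units")                          # the attribute survives the subset
```

With this approach the data frame itself stays an ordinary data frame; only the columns that need protecting carry the extra class.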

Rich

On Sat, Oct 19, 2019 at 8:20 PM Rolf Turner <[hidden email]> wrote:

> [...]


Re: Retaining attributes of columns of a data frame when subsetting.

Rolf Turner
On 20/10/19 3:00 PM, Richard M. Heiberger wrote:
> Look at
> methods(as.data.frame)
> Define your specialized columns to have a newly defined class, say "myclass".
> Then write as.data.frame.myclass
> It will be similar to the function you already have in the lapply statement.
> Now your statement
> X <- X[ok,]
> should work.

Yes.  That idea does indeed look promising.  I'll check it out.
Thanks.

cheers,

Rolf


Re: Retaining attributes of columns of a data frame when subsetting.

Rui Barradas
Hello,

Richard's idea is good but shouldn't it be `[.myclass` instead?


`[.myclass` <- function(x, i, j, drop = if (missing(i)) TRUE else
length(cols) == 1){
   SaveAt <- lapply(X, attributes)
   X <- NextMethod()
   lX <- lapply(names(X),function(nm, x, Sat){
     attributes(x[[nm]]) <- Sat[[nm]]
     x[[nm]]}, x = X, Sat = SaveAt)
   names(lX) <- names(X)
   X <- as.data.frame(lX)
   X
}

X <- data.frame(a = letters[1:5], x = 1:5)
class(X) <- c("myclass", class(X))
attr(X$a, "attr_a1") <- "first_a"
attr(X$a, "attr_a2") <- "second_a"
str(X)

ok <- c(1, 3, 4)
X <- X[ok, ]
str(X)
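
A note on the default for `drop` in the method above: it refers to `cols`, which is not defined anywhere in the method. It appears to mirror the default used by base R's `[.data.frame`, where `cols` is a local variable. It does not cause an error here because R evaluates default arguments lazily, only when the argument is first used, and the method never uses `drop`. A small illustration with two hypothetical functions, `f` and `g`:

```r
f <- function(x, drop = length(cols) == 1) {
  # `drop` is never used, so its default (which mentions an
  # undefined `cols`) is never evaluated
  x * 2
}
f(21)  # 42

g <- function(x, drop = length(cols) == 1) {
  # using `drop` forces the default, which fails because `cols` does not exist
  if (drop) x else x * 2
}
res <- tryCatch(g(21), error = function(e) conditionMessage(e))
res    # an "object not found" style message mentioning cols
```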


Hope this helps,

Rui Barradas

At 03:13 on 20/10/19, Rolf Turner wrote:

> [...]


Re: Retaining attributes of columns of a data frame when subsetting.

Rolf Turner

On 21/10/19 1:15 AM, Rui Barradas wrote:

> Hello,
>
> Richard's idea is good but shouldn't it be `[.myclass` instead?

Yes, I kind of thought that, and cobbled together something on that
basis that seemed to work.  However my code was rather a hodge-podge.  I
kept having to work around errors that I didn't understand.

You seem to have a much clearer understanding of what's going on, and
your code is much cleaner than what I came up with.

>
>
> `[.myclass` <- function(x, i, j, drop = if (missing(i)) TRUE else
>                    length(cols) == 1){
>    SaveAt <- lapply(X, attributes)
>    X <- NextMethod()
>    lX <- lapply(names(X),function(nm, x, Sat){
>      attributes(x[[nm]]) <- Sat[[nm]]
>      x[[nm]]}, x = X, Sat = SaveAt)
>    names(lX) <- names(X)
>    X <- as.data.frame(lX)
>    X
> }

But in the foregoing there seems to me to be some inconsistency in the
use of "X" and "x".

Should not the first function argument be "X" rather than "x"?  Or
perhaps the "X" symbols in the code should be replaced by "x"?  As in:

     SaveAt <- lapply(x, attributes)
     x <- NextMethod()
     ....
     ....

Or am I misunderstanding what's going on (as is so often
the case! :-( )?

> X <- data.frame(a = letters[1:5], x = 1:5)
> class(X) <- c("myclass", class(X))
> attr(X$a, "attr_a1") <- "first_a"
> attr(X$a, "attr_a2") <- "second_a"
> str(X)
>
> ok <- c(1, 3, 4)
> X <- X[ok, ]
> str(X)
>
>
> Hope this helps,

Quite a lot!  But I'd appreciate clarification w.r.t. the misgiving that
I expressed above.

Thanks.

cheers,

Rolf


Re: Retaining attributes of columns of a data frame when subsetting.

Rui Barradas
Hello,

Sorry, you're right: inside the method it should be x; X is the test data frame.
Repost:

`[.myclass` <- function(x, i, j, drop = if (missing(i)) TRUE else
length(cols) == 1){
   SaveAt <- lapply(x, attributes)
   x <- NextMethod()
   lX <- lapply(names(x),function(nm, x, Sat){
     attributes(x[[nm]]) <- Sat[[nm]]
     x[[nm]]}, x = x, Sat = SaveAt)
   names(lX) <- names(x)
   x <- as.data.frame(lX)
   x
}
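
One caveat with the method above, as far as I can tell: `as.data.frame(lX)` returns a plain data frame, so the "myclass" class and the original row names are lost, and a second subset would no longer go through the method. A possible variant, offered as a sketch rather than a definitive implementation, restores the attributes in place instead of rebuilding the data frame. The choice to skip "names", "dim" and "dimnames" is an assumption: those depend on the length of the column and cannot simply be copied onto a subset.

```r
`[.myclass` <- function(x, ...) {
  saved <- lapply(x, attributes)            # per-column attributes before subsetting
  out <- NextMethod()                       # let `[.data.frame` do the actual work
  if (!is.data.frame(out)) return(out)      # e.g. a single column dropped to a vector
  for (nm in names(out)) {
    a <- saved[[nm]]
    # skip attributes whose length is tied to the length of the column
    a <- a[setdiff(names(a), c("names", "dim", "dimnames"))]
    for (k in names(a)) attr(out[[nm]], k) <- a[[k]]
  }
  out                                       # class and row names are untouched
}

# Illustrative data; the "units" attribute is made up
X <- data.frame(a = 1:5, b = c(0.5, 1.5, 2.5, 3.5, 4.5))
attr(X$a, "units") <- "cm"
class(X) <- c("myclass", class(X))

Y <- X[c(TRUE, FALSE, TRUE, TRUE, FALSE), ]
Z <- Y[1:2, ]                               # a second subset still dispatches
attr(Z$a, "units")
```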


This (frequent) error comes from testing in a session where an X had
already been created in the global environment, so the method found that
X instead of failing.
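
The effect is easy to reproduce. In this small sketch (the function name `count_rows` is hypothetical), a typo inside the function goes unnoticed because R's scoping rules find an object of the same name outside the function:

```r
X <- data.frame(a = 1:3)               # leftover test object in the workspace

count_rows <- function(x) {
  nrow(X)                              # typo: should be nrow(x)
}

count_rows(data.frame(a = 1:10))       # 3, not 10: the workspace X was used
```

Running the test in a fresh session, or after `rm(X)`, turns the silent mistake into an "object not found" error.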

Rui Barradas

At 22:55 on 20/10/19, Rolf Turner wrote:

> [...]


Re: Retaining attributes of columns of a data frame when subsetting.

Rolf Turner

On 21/10/19 11:07 AM, Rui Barradas wrote:

> Hello,
>
> Sorry, you're right, in the method it's x, X is the test dataframe.
> Repost:
>
> `[.myclass` <- function(x, i, j, drop = if (missing(i)) TRUE else
> length(cols) == 1){
>    SaveAt <- lapply(x, attributes)
>    x <- NextMethod()
>    lX <- lapply(names(x),function(nm, x, Sat){
>      attributes(x[[nm]]) <- Sat[[nm]]
>      x[[nm]]}, x = x, Sat = SaveAt)
>    names(lX) <- names(x)
>    x <- as.data.frame(lX)
>    x
> }
>
>
> The (frequent) error comes from tests where a X was created in the
> globalenv and found by the method.

Yep!  Happens to me all the time! :-)

Thanks very much.

cheers,

Rolf
