model.frame strips class as promised, but fails to strip OBJECT in C

Michael Chirico
Full thread here:

https://github.com/tidyverse/broom/issues/287

Reproducible example:

is.object(freeny$y)
# [1] TRUE
attr(freeny$y, 'class')
# [1] "ts"
class(freeny$y)
# [1] "ts"

# ts attribute wiped by model.frame
class(model.frame(y ~ ., data = freeny)$y)
# [1] "numeric"
attr(model.frame(y ~ ., data = freeny)$y, 'class')
# NULL

# but still:
is.object(model.frame(y ~ ., data = freeny)$y)
# [1] TRUE

That is, given a numeric vector with class "ts", model.frame strips the
"ts" class attribute but leaves the OBJECT bit set, so is.object() still
returns TRUE.

This behavior is alluded to in ?model.frame:

> Unless na.action = NULL, time-series attributes will be removed from the
> variables found (since they will be wrong if NAs are removed).

And in fact explicitly setting na.action = NULL prevents dropping the class:

class(model.frame(y ~ ., data = freeny, na.action = NULL)$y)
# [1] "ts"

The reason this looks especially like a bug is that it differs from how
na.omit behaves:

DF <- data.frame(x = c(1, 2, 3), y = c(0, 10, NA))
is.object(DF$y)
# [1] FALSE
class(DF$y) = 'foo'
is.object(DF$y)
# [1] TRUE
class(na.omit(DF)$y)
# [1] "numeric"
is.object(na.omit(DF)$y)
# [1] FALSE


That is, similarly presented with a classed object, na.omit strips the
class *and* clears the OBJECT bit.
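
To make the comparison easier to repeat on other data, here is a minimal sketch of a checker. The name find_object_mismatch is hypothetical (it is not part of base R); it only reports columns where is.object() disagrees with the presence of a class attribute, and what it returns for the model.frame call will depend on the R version in use.

```r
# Hypothetical helper (not part of base R): list the columns whose OBJECT bit
# disagrees with the presence of a class attribute.
find_object_mismatch <- function(df) {
  mismatch <- vapply(
    df,
    function(col) {
      has_class <- !is.null(attr(col, "class"))
      is.object(col) != has_class
    },
    logical(1)
  )
  names(df)[mismatch]
}

# A well-formed data frame: the factor has both a class and the OBJECT bit,
# the numeric column has neither, so no column is reported.
ok <- data.frame(x = c(1, 2, 3), f = factor(c("a", "b", "a")))
find_object_mismatch(ok)

# The case from this post; the result depends on the R version.
find_object_mismatch(model.frame(y ~ ., data = freeny))
```

An empty character vector means every column is consistent; any name returned points at a column in the state described above.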

Thanks,
Michael Chirico

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: model.frame strips class as promised, but fails to strip OBJECT in C

Joris FA Meys
Given the documentation in ?is.object and the info in R Internals section
1.1.2, I'd argue that this is indeed a bug.

Looking at
https://github.com/wch/r-source/blob/trunk/src/library/stats/src/model.c
(line 220 and following), the function copyMostAttribNoTs is called to
restore the attributes lost after the na.action. That decision makes sense,
but copyMostAttribNoTs sets the object bit to the original state of the
input, even though, in the case of a ts object, both the class and the tsp
attribute were dropped and not restored. More specifically, line 338 of
https://github.com/wch/r-source/blob/3d0650af456d97648c72de66a40b85d3ec96a497/src/main/attrib.c
is imho the place where the object bit is set on a former ts object that
lost its attributes, and if I read R Internals correctly, that shouldn't
happen.
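
To illustrate the reasoning, here is a toy model in R. It is not the C code: a vector is represented as a plain list holding its attributes and an object flag, the function and argument names (copy_attrs_no_ts, sync_object_bit) are made up, and I assume that "ts" is simply removed from the class vector. The default branch mimics what is described above (flag copied from the input); the other branch shows what I would expect instead (flag derived from whether a class is left).

```r
# Toy model only: a "vector" is list(attrs = <named list>, object = <logical>).
# Real R keeps the OBJECT bit in the C-level header of the object.
copy_attrs_no_ts <- function(inp, sync_object_bit = FALSE) {
  attrs <- inp$attrs
  attrs$tsp <- NULL                            # drop the time-series parameters
  attrs$class <- setdiff(attrs$class, "ts")    # drop the "ts" class
  if (length(attrs$class) == 0) attrs$class <- NULL
  object <- if (sync_object_bit) {
    !is.null(attrs$class)                      # flag follows the remaining class
  } else {
    inp$object                                 # flag copied from the input
  }
  list(attrs = attrs, object = object)
}

# Illustrative ts-like input (the tsp values are made up).
ts_input <- list(attrs = list(tsp = c(1962, 1971, 4), class = "ts"),
                 object = TRUE)

as_described <- copy_attrs_no_ts(ts_input)
as_expected  <- copy_attrs_no_ts(ts_input, sync_object_bit = TRUE)

is.null(as_described$attrs$class)  # TRUE: the class is gone
as_described$object                # TRUE: the flag is still set
as_expected$object                 # FALSE: the flag follows the class
```

In the model, an input with another class (say "foo") keeps its flag in both branches, which matches the idea that only the class-less case is affected.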

Cheers
Joris

On Mon, Mar 5, 2018 at 4:59 PM, Michael Chirico <[hidden email]>
wrote:

> [...]



--
Joris Meys
Statistical consultant

Department of Data Analysis and Mathematical Modelling
Ghent University
Coupure Links 653, B-9000 Gent (Belgium)