Odd behaviour in within.list() when deleting 2+ variables

R devel mailing list
The behaviour of within() with list input changes if you delete 2 or more variables, compared to deleting one:

l <- list(x=1, y=2, z=3)

within(l,
{
    rm(z)
})
#$x
#[1] 1
#
#$y
#[1] 2


within(l, {
    rm(y)
    rm(z)
})
#$x
#[1] 1
#
#$y
#NULL
#
#$z
#NULL


When 2 or more variables are deleted, the list entries are instead set to NULL. Is this intended?
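
Until the behaviour is settled, one possible workaround is to strip the NULL entries afterwards. This is only a sketch: it assumes no element of the list is meant to hold NULL, and the name `res` is illustrative.

```r
l <- list(x = 1, y = 2, z = 3)

res <- within(l, {
    rm(y)
    rm(z)
})

# Assigning NULL through a logical index deletes every selected element,
# so any entries left behind as NULL placeholders are dropped here.
res[vapply(res, is.null, logical(1))] <- NULL

names(res)
```

The extra line is harmless when within() already removes the entries, since the logical index is then all FALSE and nothing is deleted.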

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Re: Odd behaviour in within.list() when deleting 2+ variables

Peter Dalgaard-2
This seems to be due to changes made by Martin Maechler in 2008. Presumably this fixed something, but it escapes my memory.

However, it seems to have broken the equivalence between within.list and within.data.frame, so now

within.list <- within.data.frame

does not suffice.

The crux of the matter seems to be that both the following constructions work for data frames

> aq <- head(airquality)
> names(aq)
[1] "Ozone"   "Solar.R" "Wind"    "Temp"    "Month"   "Day"    
> aq[c("Wind","Temp")] <- NULL
> aq
  Ozone Solar.R Month Day
1    41     190     5   1
2    36     118     5   2
3    12     149     5   3
4    18     313     5   4
5    NA      NA     5   5
6    28      NA     5   6
> aq <- head(airquality)
> aq[c("Wind","Temp")] <- vector("list",2)
> aq
  Ozone Solar.R Month Day
1    41     190     5   1
2    36     118     5   2
3    12     149     5   3
4    18     313     5   4
5    NA      NA     5   5
6    28      NA     5   6

However, for lists they differ:

> aq <- as.list(head(airquality))
> aq[c("Wind","Temp")] <- vector("list",2)
> aq
$Ozone
[1] 41 36 12 18 NA 28

$Solar.R
[1] 190 118 149 313  NA  NA

$Wind
NULL

$Temp
NULL

$Month
[1] 5 5 5 5 5 5

$Day
[1] 1 2 3 4 5 6

> aq <- as.list(head(airquality))
> aq[c("Wind","Temp")] <- NULL
> aq
$Ozone
[1] 41 36 12 18 NA 28

$Solar.R
[1] 190 118 149 313  NA  NA

$Month
[1] 5 5 5 5 5 5

$Day
[1] 1 2 3 4 5 6
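
The same contrast can be reproduced on a three-element list, which makes the link to the original report easier to see. A small sketch; the names `a`, `b` and `d` are illustrative:

```r
l <- list(x = 1, y = 2, z = 3)

a <- l
a[c("y", "z")] <- NULL                # deletes both elements

b <- l
b[c("y", "z")] <- vector("list", 2)   # keeps both names, each holding NULL

length(a)   # 1
length(b)   # 3

# For a data frame, the list-of-NULLs assignment drops the columns,
# as in the airquality transcript above.
d <- as.data.frame(l)
d[c("y", "z")] <- vector("list", 2)
names(d)
```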


-pd

> On 26 Jun 2017, at 04:40 , Hong Ooi via R-devel <[hidden email]> wrote:
> [...]

--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: [hidden email]  Priv: [hidden email]

Re: Odd behaviour in within.list() when deleting 2+ variables

Martin Maechler
>>>>> peter dalgaard <[hidden email]>
>>>>>     on Mon, 26 Jun 2017 13:43:28 +0200 writes:

    > This seems to be due to changes made by Martin Maechler in
    > 2008. Presumably this fixed something, but it escapes my
    > memory.

Yes: The change set (svn -c46441) also contains the following NEWS entry

 BUG FIXES
 
     o within(<dataframe>, { ... }) now also works when '...' removes
      more than one column.


    > However, it seems to have broken the equivalence
    > between within.list and within.data.frame, so now

    >   within.list <- within.data.frame

    > does not suffice.

There have been many improvements since then, so maybe we can
change the code so that the above will work again.

Another problem seems to be that we had no tests of  within.list()
anywhere... so we will have them now.
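
A regression check for this report might look something like the following. It is only a sketch of such a test, not the one added to R's test suite, and it assumes an R version in which the fix is present, so that both calls drop the variables.

```r
l <- list(x = 1, y = 2, z = 3)

# Deleting one variable and deleting two should both remove the entries.
one <- within(l, rm(z))
two <- within(l, {
    rm(y)
    rm(z)
})

stopifnot(identical(one, list(x = 1, y = 2)),
          identical(two, list(x = 1)))
```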

I've had an idea that seems to work and even simplifies the
code... I will get back to the issue later in the evening.

Martin


Re: Odd behaviour in within.list() when deleting 2+ variables

Peter Dalgaard-2

> On 26 Jun 2017, at 19:04 , Martin Maechler <[hidden email]> wrote:
>
>>>>>> peter dalgaard <[hidden email]>
>>>>>>    on Mon, 26 Jun 2017 13:43:28 +0200 writes:
>
>> This seems to be due to changes made by Martin Maechler in
>> 2008. Presumably this fixed something, but it escapes my
>> memory.
>
> Yes: The change set (svn -c46441) also contains the following NEWS entry
>
> BUG FIXES
>
>     o within(<dataframe>, { ... }) now also works when '...' removes
>     more than one column.
>

The odd thing is that the assign-NULL technique used for removing a single column now also seems to work for several columns in a data frame, so I wonder what the bug was back then...

-pd


Re: Odd behaviour in within.list() when deleting 2+ variables

Martin Maechler
>>>>> "PD" == Peter Dalgaard <[hidden email]>
>>>>>     on Mon, 26 Jun 2017 20:12:38 +0200 writes:

    >> On 26 Jun 2017, at 19:04 , Martin Maechler
    >> <[hidden email]> wrote:
    >>
    >>>>>>> peter dalgaard <[hidden email]> on Mon, 26 Jun
    >>>>>>> 2017 13:43:28 +0200 writes:
    >>
    >>> This seems to be due to changes made by Martin Maechler
    >>> in 2008. Presumably this fixed something, but it escapes
    >>> my memory.
    >>
    >> Yes: The change set (svn -c46441) also contains the
    >> following NEWS entry
    >>
    >> BUG FIXES
    >>
    >> o within(<dataframe>, { ... }) now also works when '...'
    >> removes more than one column.
    >>

    > The odd thing is that the assign-NULL technique used for
    > removing a single column, NOW also seems to work for
    > several columns in a data frame, so I wonder what the
    > bug was back then...

It did not work back then:  We have had lots of improvements in
 [.data.frame in these almost 9 years.

Indeed, the fix I've committed reverts almost to the previous
first version of  within.data.frame  (which is from Peter
Dalgaard, for those who don't know).
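
For readers who want to see the shape of the approach, here is a minimal sketch of a within()-like function for lists that deletes variables by assigning NULL. It is not the committed code; `within_sketch` and its internals are hypothetical, written only to illustrate the technique discussed in this thread.

```r
# Hypothetical helper, not the code in base R.
within_sketch <- function(data, expr) {
    parent <- parent.frame()
    env <- list2env(data, parent = parent)   # list elements become variables
    eval(substitute(expr), env)              # run the user's block there
    new <- as.list(env, all.names = TRUE)    # variables that survived
    del <- setdiff(names(data), names(new))  # variables that were rm()'d
    data[names(new)] <- new                  # update or add entries by name
    data[del] <- NULL                        # NULL assignment deletes entries
    data
}

l <- list(x = 1, y = 2, z = 3)
res <- within_sketch(l, {
    rm(y)
    rm(z)
})
names(res)
```

Because the deletion is a plain `data[del] <- NULL`, the same line behaves identically whether one variable or several were removed.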

Martin


Re: Odd behaviour in within.list() when deleting 2+ variables

Peter Dalgaard-2

> On 26 Jun 2017, at 21:56 , Martin Maechler <[hidden email]> wrote:
>
>
> Indeed, the fix I've committed reverts almost to the previous
> first version of  within.data.frame  (which is from Peter
> Dalgaard, for those who don't know).
>

Great foresight on my part there, eh? ;-)

-p
