Confused about NAMED

Matthew Dowle
Hi,

I expected NAMED to be 1 in all these three cases. It is for one of them,
but not the other two?

> R --vanilla
R version 2.14.0 (2011-10-31)
Platform: i386-pc-mingw32/i386 (32-bit)

> x = 1L
> .Internal(inspect(x))   # why NAM(2)? expected NAM(1)
@2514aa0 13 INTSXP g0c1 [NAM(2)] (len=1, tl=0) 1

> y = 1:10
> .Internal(inspect(y))   # NAM(1) as expected but why different to x?
@272f788 13 INTSXP g0c4 [NAM(1)] (len=10, tl=0) 1,2,3,4,5,...

> z = data.frame()
> .Internal(inspect(z))   # why NAM(2)? expected NAM(1)
@24fc28c 19 VECSXP g0c0 [OBJ,NAM(2),ATT] (len=0, tl=0)
ATTRIB:
  @24fc270 02 LISTSXP g0c0 []
    TAG: @3f2120 01 SYMSXP g0c0 [MARK,gp=0x4000] "names"
    @24fc334 16 STRSXP g0c0 [] (len=0, tl=0)
    TAG: @3f2040 01 SYMSXP g0c0 [MARK,gp=0x4000] "row.names"
    @24fc318 13 INTSXP g0c0 [] (len=0, tl=0)
    TAG: @3f2388 01 SYMSXP g0c0 [MARK,gp=0x4000] "class"
    @25be500 16 STRSXP g0c1 [] (len=1, tl=0)
      @1d38af0 09 CHARSXP g0c2 [MARK,gp=0x21,ATT] "data.frame"

It's a little difficult to search for the word "named" but I tried and
found this in R-ints :

    "Note that optimizing NAMED = 1 is only effective within a primitive
(as the closure wrapper of a .Internal will set NAMED = 2 when the
promise to the argument is evaluated)"

So might it be that just looking at NAMED using .Internal(inspect()) is
setting NAMED=2?  But if so, why does y have NAMED==1?
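
For anyone reproducing the session above, here is a minimal sketch of capturing the inspect output as text so the flags can be compared. It assumes `.Internal(inspect())` is available in your build; it is an undocumented internal and its output format is not guaranteed. Newer R releases use reference counting, so the flag may print as REF(n) where this thread shows NAM(n). The names `hx` and `hy` are only illustrative.

```r
# Capture the first line that .Internal(inspect()) prints for each object.
# Assumption: inspect writes to standard output, so capture.output() sees it.
x <- 1L
y <- 1:10

hx <- capture.output(.Internal(inspect(x)))[1]
hy <- capture.output(.Internal(inspect(y)))[1]

# The bracketed part of each line holds the NAM(n) or REF(n) flag.
print(hx)
print(hy)
```

The exact flag values depend on the R version, so the sketch prints them rather than assuming what they are.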

Thanks!
Matthew

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: Confused about NAMED

Peter Dalgaard-2

On Nov 24, 2011, at 11:13 , Matthew Dowle wrote:

> So might it be that just looking at NAMED using .Internal(inspect()) is
> setting NAMED=2?  But if so, why does y have NAMED==1?

This is tricky business... I'm not quite sure I'll get it right, but let's try

When you are assigning a constant, the value you assign is already part of the assignment expression, so if you want to modify it, you must duplicate. So NAMED==2 on z <- 1 is basically to prevent you from accidentally "changing the value of 1". If it weren't, then you could get bitten by code like for(i in 1:2) {z <- 1; if(i==1) z[1] <- 2}.

If you're assigning the result of a computation, then the object only exists once, so
z <- 0+1  gets NAMED==1.

However, if the computation is done by returning a named value from within a function, as in

> f <- function(){v <- 1+0; v}
> z <- f()

then again NAMED==2. This is because the side effects of the function _might_ result in something having a hold on the function environment, e.g. if we had

e <- NULL
f <- function(){e <<-environment(); v <- 1+0; v}
z <- f()

then z[1] <- 5 would change e$v too. As it happens, there aren't any side effects in the former case, but R loses track and assumes the worst.
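
The two hazards described above can be checked at the user level. This is a sketch of the visible semantics only: it shows what the NAMED bookkeeping protects, not the NAMED values themselves, which differ across R versions. The name `loop_result` is introduced here for illustration.

```r
# Case 1: the loop. If the constant 1 inside the loop body could be modified
# in place, the second iteration would assign 2 instead of 1.
for (i in 1:2) {
  z <- 1
  if (i == 1) z[1] <- 2
}
loop_result <- z   # value left behind by the second iteration

# Case 2: the function environment stays reachable through e, so the
# returned value and e$v start out as the same object.
e <- NULL
f <- function() { e <<- environment(); v <- 1 + 0; v }
z <- f()
z[1] <- 5          # R must duplicate here so that e$v is left alone

print(loop_result)
print(e$v)
print(z)
```

In both cases the modification is confined to z, which is exactly the guarantee that a conservative NAMED==2 buys.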



--
Peter Dalgaard, Professor
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: [hidden email]  Priv: [hidden email]

Re: Confused about NAMED

Matthew Dowle
Thanks a lot, think I follow. That explains x vs y, but why is z NAMED==2?
The result of data.frame() is an object that exists once (similar to 1:10)
so shouldn't it be NAMED==1 too?  Or, R loses track and assumes the worst
even on its own functions such as data.frame()?

Re: Confused about NAMED

Duncan Murdoch-2
On 11-11-24 6:34 AM, Matthew Dowle wrote:

> Thanks a lot, think I follow. That explains x vs y, but why is z NAMED==2?
> The result of data.frame() is an object that exists once (similar to 1:10)
> so shouldn't it be NAMED==1 too?  Or, R loses track and assumes the worst
> even on its own functions such as data.frame()?

R has several types of functions -- see the R Internals manual for
details.  data.frame() is a plain R function, so it is treated no
differently than any user-written function.  On the other hand, the
internal function that implements the : operator is a "primitive", so it
has complete control over its return value, and it can set NAMED in the
most efficient way.

So you might think that returning a value as an evaluation of a
primitive adds efficiency, e.g. in Peter's example

f<- function(){v<- 1+0; v + 0}

will return NAMED == 1.  But that's because internally it had to make a
copy of v before adding 0 to it, so you've probably really made it less
efficient:  the original version might never modify the result, so it
might never make a copy.
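
The distinction between primitives and plain R functions can be checked directly with `is.primitive()`. A small sketch, where the vector name `kinds` is only illustrative:

```r
# TRUE means the function is implemented as a primitive in C and so controls
# its own return value; FALSE means it is an ordinary R closure.
kinds <- c(
  colon      = is.primitive(`:`),
  list       = is.primitive(list),
  data.frame = is.primitive(data.frame),
  structure  = is.primitive(structure),
  mean       = is.primitive(mean)
)
print(kinds)
```

This matches the cases in the thread: `:` and list() are primitives, while data.frame(), structure() and mean() are closures.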

Duncan Murdoch

Re: Confused about NAMED

Peter Dalgaard-2
In reply to this post by Matthew Dowle

On Nov 24, 2011, at 12:34 , Matthew Dowle wrote:

> Thanks a lot, think I follow. That explains x vs y, but why is z NAMED==2?
> The result of data.frame() is an object that exists once (similar to 1:10)
> so shouldn't it be NAMED==1 too?  Or, R loses track and assumes the worst
> even on its own functions such as data.frame()?

R loses track. I suspect that is really all it can do without actual reference counting. The function data.frame is more than 150 lines of code, and if any of those end up invoking user code, possibly via a class method, you can't tell definitively whether or not the evaluation environment dies at the return.

Re: Confused about NAMED

Matthew Dowle
> R loses track. I suspect that is really all it can do without actual
> reference counting. The function data.frame is more than 150 lines of
> code, and if any of those end up invoking user code, possibly via a class
> method, you can't tell definitively whether or not the evaluation
> environment dies at the return.

Ohhh, think I see now. After Duncan's reply I was going to ask if it was
possible to change data.frame() to be primitive so it could set NAMED=1.
But it seems primitive functions can't use R code, so data.frame() would
need to be ported to C. Ok! - not quick or easy, and not without
considerable risk. And data.frame() can invoke user code inside it anyway.

Since list() is primitive I tried to construct a data.frame starting with
list() [since structure() isn't primitive], but then merely adding an
attribute seems to set NAMED==2 too ?

> DF = list(a=1:3,b=4:6)
> .Internal(inspect(DF))     # so far so good: NAM(1)
@25149e0 19 VECSXP g0c1 [NAM(1),ATT] (len=2, tl=0)
  @263ea50 13 INTSXP g0c2 [] (len=3, tl=0) 1,2,3
  @263eaa0 13 INTSXP g0c2 [] (len=3, tl=0) 4,5,6
ATTRIB:
  @2457984 02 LISTSXP g0c0 []
    TAG: @3f2120 01 SYMSXP g0c0 [MARK,gp=0x4000] "names"
    @25149c0 16 STRSXP g0c1 [] (len=2, tl=0)
      @1e987d8 09 CHARSXP g0c1 [MARK,gp=0x21] "a"
      @1e56948 09 CHARSXP g0c1 [MARK,gp=0x21] "b"
>
> attr(DF,"foo") <- "bar"    # just adding an attribute sets NAM(2) ?
> .Internal(inspect(DF))
@25149e0 19 VECSXP g0c1 [NAM(2),ATT] (len=2, tl=0)
  @263ea50 13 INTSXP g0c2 [] (len=3, tl=0) 1,2,3
  @263eaa0 13 INTSXP g0c2 [] (len=3, tl=0) 4,5,6
ATTRIB:
  @2457984 02 LISTSXP g0c0 []
    TAG: @3f2120 01 SYMSXP g0c0 [MARK,gp=0x4000] "names"
    @25149c0 16 STRSXP g0c1 [] (len=2, tl=0)
      @1e987d8 09 CHARSXP g0c1 [MARK,gp=0x21] "a"
      @1e56948 09 CHARSXP g0c1 [MARK,gp=0x21] "b"
    TAG: @245732c 01 SYMSXP g0c0 [] "foo"
    @25148a0 16 STRSXP g0c1 [NAM(1)] (len=1, tl=0)
      @2514920 09 CHARSXP g0c1 [gp=0x20] "bar"


Matthew


Re: Confused about NAMED

Peter Dalgaard-2

On Nov 24, 2011, at 14:05 , Matthew Dowle wrote:

> Since list() is primitive I tried to construct a data.frame starting with
> list() [since structure() isn't primitive], but then merely adding an
> attribute seems to set NAMED==2 too ?

Yes. As soon as there is the slightest risk of having (or having had) two references to the same object, NAMED is set to 2, and it is never reduced. While your mind is boggling, I might boggle it a bit more:

> z <- 1:10
> .Internal(inspect(z))
@116e11788 13 INTSXP g0c4 [NAM(1)] (len=10, tl=0) 1,2,3,4,5,...
> m <- mean(z)
> .Internal(inspect(z))
@116e11788 13 INTSXP g0c4 [NAM(2)] (len=10, tl=0) 1,2,3,4,5,...

This happens because while mean() is running, there is a second reference to z, namely mean's argument x. (With languages like R, you have no assurance that there will be no changes to the global environment while a function call is being evaluated, so bugs can bite in both places -- z or x.)
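
A sketch of why both places matter, using two hypothetical helper functions (`modify_arg` and `modify_global` are not from the thread). It checks only the user-visible behaviour that the second reference has to preserve:

```r
z <- 1:10

# The callee changes its argument: the caller's z must stay intact.
modify_arg <- function(x) { x[1] <- 99L; x }
w <- modify_arg(z)
z_after_first <- z[1]

# The global changes while the call is running: the argument must stay intact.
modify_global <- function(x) {
  force(x)        # evaluate the promise now, so x refers to the current z
  z[1] <<- 0L     # modify the global z during the call
  x[1]            # x still has to show the old value
}
seen_inside <- modify_global(z)

print(c(w[1], z_after_first, seen_inside, z[1]))
```

Note the `force(x)`: without it, lazy evaluation would delay looking up z until after the global had already been changed.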

There are many of these cases where you might pragmatically want to override the default NAMED logic, but you'd be stepping into treacherous waters. Luke has probably been giving these matters quite some thought in connection with his compiler project.

Re: Confused about NAMED

Matthew Dowle
Ok, very interesting. Think I'm there.
Thanks for all the info.

Matthew

Re: Confused about NAMED

Simon Urbanek
In reply to this post by Matthew Dowle

On Nov 24, 2011, at 8:05 AM, Matthew Dowle wrote:

> Since list() is primitive I tried to construct a data.frame starting with
> list() [since structure() isn't primitive], but then merely adding an
> attribute seems to set NAMED==2 too ?
>

Yes, because attr(x,y) <- z is the same as

`*tmp*` <- x
x <- `attr<-`(`*tmp*`, y, z)
rm(`*tmp*`)

so there are two references to the data frame: one in DF and one in `*tmp*`. It is the first line that causes the NAMED bump. And, yes, it's real:

> `f<-`=function(x,value) { print(ls(parent.frame())); x<-value }
> x=1
> f(x)=1
[1] "*tmp*" "f<-"   "x"    
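
The expansion above can be written out by hand and compared with the ordinary syntax. This is a sketch of the equivalence at the level of results only; the variable names `a` and `b` are illustrative, and the manual steps follow the three lines quoted above rather than the interpreter's actual C code.

```r
a <- list(a = 1:3, b = 4:6)
b <- list(a = 1:3, b = 4:6)

# Ordinary replacement syntax.
attr(a, "foo") <- "bar"

# The same assignment spelled out step by step.
`*tmp*` <- b
b <- `attr<-`(`*tmp*`, "foo", "bar")
rm(`*tmp*`)

print(identical(a, b))
```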

You could skip that by using the function directly (I don't think it's recommended, though):

> .Internal(inspect(l <- list(a=1)))
@1028c82f8 19 VECSXP g0c1 [NAM(1),ATT] (len=1, tl=0)
  @1028c8268 14 REALSXP g0c1 [] (len=1, tl=0) 1
ATTRIB:
  @100b6e748 02 LISTSXP g0c0 []
    TAG: @100843878 01 SYMSXP g0c0 [MARK,gp=0x4000] "names"
    @1028c82c8 16 STRSXP g0c1 [] (len=1, tl=0)
      @1009cd388 09 CHARSXP g0c1 [MARK,gp=0x21] "a"
> .Internal(inspect(`names<-`(l, "b")))
@1028c82f8 19 VECSXP g0c1 [NAM(1),ATT] (len=1, tl=0)
  @1028c8268 14 REALSXP g0c1 [] (len=1, tl=0) 1
ATTRIB:
  @100b6e748 02 LISTSXP g0c0 []
    TAG: @100843878 01 SYMSXP g0c0 [MARK,gp=0x4000] "names"
    @1028c8178 16 STRSXP g0c1 [NAM(1)] (len=1, tl=0)
      @100967af8 09 CHARSXP g0c1 [MARK,gp=0x20] "b"
> .Internal(inspect(l))
@1028c82f8 19 VECSXP g0c1 [NAM(1),ATT] (len=1, tl=0)
  @1028c8268 14 REALSXP g0c1 [] (len=1, tl=0) 1
ATTRIB:
  @100b6e748 02 LISTSXP g0c0 []
    TAG: @100843878 01 SYMSXP g0c0 [MARK,gp=0x4000] "names"
    @1028c8178 16 STRSXP g0c1 [NAM(1)] (len=1, tl=0)
      @100967af8 09 CHARSXP g0c1 [MARK,gp=0x20] "b"
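
For contrast, here is a sketch of the routine ways to rename without calling `names<-` directly. Whether a direct call modifies its argument in place depends on the R version and on how many references exist, so the sketch avoids it altogether; `l2` and `renamed` are illustrative names.

```r
l <- list(a = 1)

# Ordinary replacement syntax on a second reference: R copies as needed,
# so l keeps its original names.
l2 <- l
names(l2) <- "b"

# stats::setNames() returns a renamed copy and leaves its input alone.
renamed <- stats::setNames(l, "c")

print(names(l))
print(names(l2))
print(names(renamed))
```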

Cheers,
Simon



>> DF = list(a=1:3,b=4:6)
>> .Internal(inspect(DF))     # so far so good: NAM(1)
> @25149e0 19 VECSXP g0c1 [NAM(1),ATT] (len=2, tl=0)
>  @263ea50 13 INTSXP g0c2 [] (len=3, tl=0) 1,2,3
>  @263eaa0 13 INTSXP g0c2 [] (len=3, tl=0) 4,5,6
> ATTRIB:
>  @2457984 02 LISTSXP g0c0 []
>    TAG: @3f2120 01 SYMSXP g0c0 [MARK,gp=0x4000] "names"
>    @25149c0 16 STRSXP g0c1 [] (len=2, tl=0)
>      @1e987d8 09 CHARSXP g0c1 [MARK,gp=0x21] "a"
>      @1e56948 09 CHARSXP g0c1 [MARK,gp=0x21] "b"
>>
>> attr(DF,"foo") <- "bar"    # just adding an attribute sets NAM(2) ?
>> .Internal(inspect(DF))
> @25149e0 19 VECSXP g0c1 [NAM(2),ATT] (len=2, tl=0)
>  @263ea50 13 INTSXP g0c2 [] (len=3, tl=0) 1,2,3
>  @263eaa0 13 INTSXP g0c2 [] (len=3, tl=0) 4,5,6
> ATTRIB:
>  @2457984 02 LISTSXP g0c0 []
>    TAG: @3f2120 01 SYMSXP g0c0 [MARK,gp=0x4000] "names"
>    @25149c0 16 STRSXP g0c1 [] (len=2, tl=0)
>      @1e987d8 09 CHARSXP g0c1 [MARK,gp=0x21] "a"
>      @1e56948 09 CHARSXP g0c1 [MARK,gp=0x21] "b"
>    TAG: @245732c 01 SYMSXP g0c0 [] "foo"
>    @25148a0 16 STRSXP g0c1 [NAM(1)] (len=1, tl=0)
>      @2514920 09 CHARSXP g0c1 [gp=0x20] "bar"
>
>
> Matthew
>
>
>> --
>> Peter Dalgaard, Professor
>> Center for Statistics, Copenhagen Business School
>> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
>> Phone: (+45)38153501
>> Email: [hidden email]  Priv: [hidden email]
>>
>>
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>
>


Re: Confused about NAMED

luke-tierney
The details of complex assignment expressions are fairly intricate.  I
wrote up some notes on this a couple of months back and have meant to
get them into the internals manual but have not gotten around to it
yet.  I'll see if I can get to it in the next week or two and will
send a note to this thread when I do. In terms of the issues
discussed so far:

   Calling a foo<- function directly is not a good idea unless you
   really understand what is going on in the assignment mechanism in
   general and in the particular foo<- function. It is definitely not
   something to be done in routine programming unless you like
   unpleasant surprises.

   attr<- could probably be modified to avoid the NAMED increment in
   this example, but I'd want to think that through fairly carefully
   before making such a change.  (Most foo<- functions that are
   primitives are written so that they avoid a NAMED increment when
   used in an assignment expression, but a few are not -- I believe
   these are almost all, maybe even all, oversights, but again I
   wouldn't want to make any changes without careful review.)
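
As a rough illustration of the first point, here is a minimal sketch contrasting the assignment form with a direct call to `names<-`. The variable names are made up for the example. What happens to the original object after a direct call depends on the R version and on how many references exist, so the sketch deliberately claims nothing about it.

```r
# Assignment form: R rebinds `l` to the result, so the change is
# always visible through `l`.
l <- list(a = 1)
names(l) <- "b"
names(l)

# Direct call: the modified value is returned. Whether `m` itself is
# also altered as a side effect is version dependent, so nothing is
# assumed about `m` after this line.
m <- list(a = 1)
r <- `names<-`(m, "b")
names(r)

# If the function form is needed, rebinding the result explicitly
# gives a well-defined outcome.
n <- list(a = 1)
n <- `names<-`(n, "c")
names(n)
```

The only portable guarantee in the direct-call case is the returned value; code that relies on the argument being changed (or left unchanged) is exposed to exactly the surprises mentioned above.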

Best,

luke

On Thu, 24 Nov 2011, Simon Urbanek wrote:

>
> On Nov 24, 2011, at 8:05 AM, Matthew Dowle wrote:
>
>>>
>>> On Nov 24, 2011, at 12:34 , Matthew Dowle wrote:
>>>
>>>>>
>>>>> On Nov 24, 2011, at 11:13 , Matthew Dowle wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I expected NAMED to be 1 in all these three cases. It is for one of
>>>>>> them,
>>>>>> but not the other two?
>>>>>>
>>>>>>> R --vanilla
>>>>>> R version 2.14.0 (2011-10-31)
>>>>>> Platform: i386-pc-mingw32/i386 (32-bit)
>>>>>>
>>>>>>> x = 1L
>>>>>>> .Internal(inspect(x))   # why NAM(2)? expected NAM(1)
>>>>>> @2514aa0 13 INTSXP g0c1 [NAM(2)] (len=1, tl=0) 1
>>>>>>
>>>>>>> y = 1:10
>>>>>>> .Internal(inspect(y))   # NAM(1) as expected but why different to x?
>>>>>> @272f788 13 INTSXP g0c4 [NAM(1)] (len=10, tl=0) 1,2,3,4,5,...
>>>>>>
>>>>>>> z = data.frame()
>>>>>>> .Internal(inspect(z))   # why NAM(2)? expected NAM(1)
>>>>>> @24fc28c 19 VECSXP g0c0 [OBJ,NAM(2),ATT] (len=0, tl=0)
>>>>>> ATTRIB:
>>>>>> @24fc270 02 LISTSXP g0c0 []
>>>>>>  TAG: @3f2120 01 SYMSXP g0c0 [MARK,gp=0x4000] "names"
>>>>>>  @24fc334 16 STRSXP g0c0 [] (len=0, tl=0)
>>>>>>  TAG: @3f2040 01 SYMSXP g0c0 [MARK,gp=0x4000] "row.names"
>>>>>>  @24fc318 13 INTSXP g0c0 [] (len=0, tl=0)
>>>>>>  TAG: @3f2388 01 SYMSXP g0c0 [MARK,gp=0x4000] "class"
>>>>>>  @25be500 16 STRSXP g0c1 [] (len=1, tl=0)
>>>>>>    @1d38af0 09 CHARSXP g0c2 [MARK,gp=0x21,ATT] "data.frame"
>>>>>>
>>>>>> It's a little difficult to search for the word "named" but I tried and
>>>>>> found this in R-ints :
>>>>>>
>>>>>>  "Note that optimizing NAMED = 1 is only effective within a primitive
>>>>>> (as the closure wrapper of a .Internal will set NAMED = 2 when the
>>>>>> promise to the argument is evaluated)"
>>>>>>
>>>>>> So might it be that just looking at NAMED using .Internal(inspect())
>>>>>> is
>>>>>> setting NAMED=2?  But if so, why does y have NAMED==1?
>>>>>
>>>>> This is tricky business... I'm not quite sure I'll get it right, but
>>>>> let's
>>>>> try
>>>>>
>>>>> When you are assigning a constant, the value you assign is already part
>>>>> of
>>>>> the assignment expression, so if you want to modify it, you must
>>>>> duplicate. So NAMED==2 on z <- 1 is basically to prevent you from
>>>>> accidentally "changing the value of 1". If it weren't, then you could
>>>>> get
>>>>> bitten by code like for(i in 1:2) {z <- 1; if(i==1) z[1] <- 2}.
>>>>>
>>>>> If you're assigning the result of a computation, then the object only
>>>>> exists once, so
>>>>> z <- 0+1  gets NAMED==1.
>>>>>
>>>>> However, if the computation is done by returning a named value from
>>>>> within
>>>>> a function, as in
>>>>>
>>>>>> f <- function(){v <- 1+0; v}
>>>>>> z <- f()
>>>>>
>>>>> then again NAMED==2. This is because the side effects of the function
>>>>> _might_ result in something having a hold on the function environment,
>>>>> e.g. if we had
>>>>>
>>>>> e <- NULL
>>>>> f <- function(){e <<-environment(); v <- 1+0; v}
>>>>> z <- f()
>>>>>
>>>>> then z[1] <- 5 would change e$v too. As it happens, there aren't any
>>>>> side
>>>>> effects in the former case, but R loses track and assumes the worst.
>>>>>
>>>>
>>>> Thanks a lot, think I follow. That explains x vs y, but why is z
>>>> NAMED==2?
>>>> The result of data.frame() is an object that exists once (similar to
>>>> 1:10)
>>>> so shouldn't it be NAMED==1 too?  Or, R loses track and assumes the
>>>> worst
>>>> even on its own functions such as data.frame()?
>>>
>>> R loses track. I suspect that is really all it can do without actual
>>> reference counting. The function data.frame is more than 150 lines of
>>> code, and if any of those end up invoking user code, possibly via a class
>>> method, you can't tell definitively whether or not the evaluation
>>> environment dies at the return.
>>
>> Ohhh, think I see now. After Duncan's reply I was going to ask if it was
>> possible to change data.frame() to be primitive so it could set NAMED=1.
>> But it seems primitive functions can't use R code so data.frame() would
>> need to be ported to C. Ok! - not quick or easy, and not without
>> considerable risk. And, data.frame() can invoke user code inside it anyway
>> then.
>>
>> Since list() is primitive I tried to construct a data.frame starting with
>> list() [since structure() isn't primitive], but then merely adding an
>> attribute seems to set NAMED==2 too ?
>>
>
> Yes, because attr(x,y) <- z is the same as
>
> `*tmp*` <- x
> x <- `attr<-`(`*tmp*`, y, z)
> rm(`*tmp*`)
>
> so there are two references to the data frame: one in DF and one in `*tmp*`. It is the first line that causes the NAMED bump. And, yes, it's real:
>
>> `f<-`=function(x,value) { print(ls(parent.frame())); x<-value }
>> x=1
>> f(x)=1
> [1] "*tmp*" "f<-"   "x"
>
> You could skip that by using the function directly (I don't think it's recommended, though):
>
>> .Internal(inspect(l <- list(a=1)))
> @1028c82f8 19 VECSXP g0c1 [NAM(1),ATT] (len=1, tl=0)
>  @1028c8268 14 REALSXP g0c1 [] (len=1, tl=0) 1
> ATTRIB:
>  @100b6e748 02 LISTSXP g0c0 []
>    TAG: @100843878 01 SYMSXP g0c0 [MARK,gp=0x4000] "names"
>    @1028c82c8 16 STRSXP g0c1 [] (len=1, tl=0)
>      @1009cd388 09 CHARSXP g0c1 [MARK,gp=0x21] "a"
>> .Internal(inspect(`names<-`(l, "b")))
> @1028c82f8 19 VECSXP g0c1 [NAM(1),ATT] (len=1, tl=0)
>  @1028c8268 14 REALSXP g0c1 [] (len=1, tl=0) 1
> ATTRIB:
>  @100b6e748 02 LISTSXP g0c0 []
>    TAG: @100843878 01 SYMSXP g0c0 [MARK,gp=0x4000] "names"
>    @1028c8178 16 STRSXP g0c1 [NAM(1)] (len=1, tl=0)
>      @100967af8 09 CHARSXP g0c1 [MARK,gp=0x20] "b"
>> .Internal(inspect(l))
> @1028c82f8 19 VECSXP g0c1 [NAM(1),ATT] (len=1, tl=0)
>  @1028c8268 14 REALSXP g0c1 [] (len=1, tl=0) 1
> ATTRIB:
>  @100b6e748 02 LISTSXP g0c0 []
>    TAG: @100843878 01 SYMSXP g0c0 [MARK,gp=0x4000] "names"
>    @1028c8178 16 STRSXP g0c1 [NAM(1)] (len=1, tl=0)
>      @100967af8 09 CHARSXP g0c1 [MARK,gp=0x20] "b"
>
> Cheers,
> Simon
>
>
>
>>> DF = list(a=1:3,b=4:6)
>>> .Internal(inspect(DF))     # so far so good: NAM(1)
>> @25149e0 19 VECSXP g0c1 [NAM(1),ATT] (len=2, tl=0)
>>  @263ea50 13 INTSXP g0c2 [] (len=3, tl=0) 1,2,3
>>  @263eaa0 13 INTSXP g0c2 [] (len=3, tl=0) 4,5,6
>> ATTRIB:
>>  @2457984 02 LISTSXP g0c0 []
>>    TAG: @3f2120 01 SYMSXP g0c0 [MARK,gp=0x4000] "names"
>>    @25149c0 16 STRSXP g0c1 [] (len=2, tl=0)
>>      @1e987d8 09 CHARSXP g0c1 [MARK,gp=0x21] "a"
>>      @1e56948 09 CHARSXP g0c1 [MARK,gp=0x21] "b"
>>>
>>> attr(DF,"foo") <- "bar"    # just adding an attribute sets NAM(2) ?
>>> .Internal(inspect(DF))
>> @25149e0 19 VECSXP g0c1 [NAM(2),ATT] (len=2, tl=0)
>>  @263ea50 13 INTSXP g0c2 [] (len=3, tl=0) 1,2,3
>>  @263eaa0 13 INTSXP g0c2 [] (len=3, tl=0) 4,5,6
>> ATTRIB:
>>  @2457984 02 LISTSXP g0c0 []
>>    TAG: @3f2120 01 SYMSXP g0c0 [MARK,gp=0x4000] "names"
>>    @25149c0 16 STRSXP g0c1 [] (len=2, tl=0)
>>      @1e987d8 09 CHARSXP g0c1 [MARK,gp=0x21] "a"
>>      @1e56948 09 CHARSXP g0c1 [MARK,gp=0x21] "b"
>>    TAG: @245732c 01 SYMSXP g0c0 [] "foo"
>>    @25148a0 16 STRSXP g0c1 [NAM(1)] (len=1, tl=0)
>>      @2514920 09 CHARSXP g0c1 [gp=0x20] "bar"
>>
>>
>> Matthew
>>
>>
>>> --
>>> Peter Dalgaard, Professor
>>> Center for Statistics, Copenhagen Business School
>>> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
>>> Phone: (+45)38153501
>>> Email: [hidden email]  Priv: [hidden email]
>>>
>>>
>>
>> ______________________________________________
>> [hidden email] mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>>
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

--
Luke Tierney
Chair, Statistics and Actuarial Science
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:             319-335-3386
Department of Statistics and        Fax:               319-335-3017
    Actuarial Science
241 Schaeffer Hall                  email:   [hidden email]
Iowa City, IA 52242                 WWW:  http://www.stat.uiowa.edu


Re: Confused about NAMED

Matthew Dowle
In reply to this post by Simon Urbanek
>
> On Nov 24, 2011, at 8:05 AM, Matthew Dowle wrote:
>
>>>
>>> On Nov 24, 2011, at 12:34 , Matthew Dowle wrote:
>>>
>>>>>
>>>>> On Nov 24, 2011, at 11:13 , Matthew Dowle wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I expected NAMED to be 1 in all these three cases. It is for one of
>>>>>> them,
>>>>>> but not the other two?
>>>>>>
>>>>>>> R --vanilla
>>>>>> R version 2.14.0 (2011-10-31)
>>>>>> Platform: i386-pc-mingw32/i386 (32-bit)
>>>>>>
>>>>>>> x = 1L
>>>>>>> .Internal(inspect(x))   # why NAM(2)? expected NAM(1)
>>>>>> @2514aa0 13 INTSXP g0c1 [NAM(2)] (len=1, tl=0) 1
>>>>>>
>>>>>>> y = 1:10
>>>>>>> .Internal(inspect(y))   # NAM(1) as expected but why different to
>>>>>>> x?
>>>>>> @272f788 13 INTSXP g0c4 [NAM(1)] (len=10, tl=0) 1,2,3,4,5,...
>>>>>>
>>>>>>> z = data.frame()
>>>>>>> .Internal(inspect(z))   # why NAM(2)? expected NAM(1)
>>>>>> @24fc28c 19 VECSXP g0c0 [OBJ,NAM(2),ATT] (len=0, tl=0)
>>>>>> ATTRIB:
>>>>>> @24fc270 02 LISTSXP g0c0 []
>>>>>>  TAG: @3f2120 01 SYMSXP g0c0 [MARK,gp=0x4000] "names"
>>>>>>  @24fc334 16 STRSXP g0c0 [] (len=0, tl=0)
>>>>>>  TAG: @3f2040 01 SYMSXP g0c0 [MARK,gp=0x4000] "row.names"
>>>>>>  @24fc318 13 INTSXP g0c0 [] (len=0, tl=0)
>>>>>>  TAG: @3f2388 01 SYMSXP g0c0 [MARK,gp=0x4000] "class"
>>>>>>  @25be500 16 STRSXP g0c1 [] (len=1, tl=0)
>>>>>>    @1d38af0 09 CHARSXP g0c2 [MARK,gp=0x21,ATT] "data.frame"
>>>>>>
>>>>>> It's a little difficult to search for the word "named" but I tried
>>>>>> and
>>>>>> found this in R-ints :
>>>>>>
>>>>>>  "Note that optimizing NAMED = 1 is only effective within a
>>>>>> primitive
>>>>>> (as the closure wrapper of a .Internal will set NAMED = 2 when the
>>>>>> promise to the argument is evaluated)"
>>>>>>
>>>>>> So might it be that just looking at NAMED using .Internal(inspect())
>>>>>> is
>>>>>> setting NAMED=2?  But if so, why does y have NAMED==1?
>>>>>
>>>>> This is tricky business... I'm not quite sure I'll get it right, but
>>>>> let's
>>>>> try
>>>>>
>>>>> When you are assigning a constant, the value you assign is already
>>>>> part
>>>>> of
>>>>> the assignment expression, so if you want to modify it, you must
>>>>> duplicate. So NAMED==2 on z <- 1 is basically to prevent you from
>>>>> accidentally "changing the value of 1". If it weren't, then you could
>>>>> get
>>>>> bitten by code like for(i in 1:2) {z <- 1; if(i==1) z[1] <- 2}.
>>>>>
>>>>> If you're assigning the result of a computation, then the object only
>>>>> exists once, so
>>>>> z <- 0+1  gets NAMED==1.
>>>>>
>>>>> However, if the computation is done by returning a named value from
>>>>> within
>>>>> a function, as in
>>>>>
>>>>>> f <- function(){v <- 1+0; v}
>>>>>> z <- f()
>>>>>
>>>>> then again NAMED==2. This is because the side effects of the function
>>>>> _might_ result in something having a hold on the function
>>>>> environment,
>>>>> e.g. if we had
>>>>>
>>>>> e <- NULL
>>>>> f <- function(){e <<-environment(); v <- 1+0; v}
>>>>> z <- f()
>>>>>
>>>>> then z[1] <- 5 would change e$v too. As it happens, there aren't any
>>>>> side
>>>>> effects in the former case, but R loses track and assumes the worst.
>>>>>
>>>>
>>>> Thanks a lot, think I follow. That explains x vs y, but why is z
>>>> NAMED==2?
>>>> The result of data.frame() is an object that exists once (similar to
>>>> 1:10)
>>>> so shouldn't it be NAMED==1 too?  Or, R loses track and assumes the
>>>> worst
>>>> even on its own functions such as data.frame()?
>>>
>>> R loses track. I suspect that is really all it can do without actual
>>> reference counting. The function data.frame is more than 150 lines of
>>> code, and if any of those end up invoking user code, possibly via a
>>> class
>>> method, you can't tell definitively whether or not the evaluation
>>> environment dies at the return.
>>
>> Ohhh, think I see now. After Duncan's reply I was going to ask if it was
>> possible to change data.frame() to be primitive so it could set NAMED=1.
>> But it seems primitive functions can't use R code so data.frame() would
>> need to be ported to C. Ok! - not quick or easy, and not without
>> considerable risk. And, data.frame() can invoke user code inside it
>> anyway
>> then.
>>
>> Since list() is primitive I tried to construct a data.frame starting
>> with
>> list() [since structure() isn't primitive], but then merely adding an
>> attribute seems to set NAMED==2 too ?
>>
>
> Yes, because attr(x,y) <- z is the same as
>
> `*tmp*` <- x
> x <- `attr<-`(`*tmp*`, y, z)
> rm(`*tmp*`)
>
> so there are two references to the data frame: one in DF and one in
> `*tmp*`. It is the first line that causes the NAMED bump. And, yes, it's
> real:
>
>> `f<-`=function(x,value) { print(ls(parent.frame())); x<-value }
>> x=1
>> f(x)=1
> [1] "*tmp*" "f<-"   "x"
>
> You could skip that by using the function directly (I don't think it's
> recommended, though):
>
>> .Internal(inspect(l <- list(a=1)))
> @1028c82f8 19 VECSXP g0c1 [NAM(1),ATT] (len=1, tl=0)
>   @1028c8268 14 REALSXP g0c1 [] (len=1, tl=0) 1
> ATTRIB:
>   @100b6e748 02 LISTSXP g0c0 []
>     TAG: @100843878 01 SYMSXP g0c0 [MARK,gp=0x4000] "names"
>     @1028c82c8 16 STRSXP g0c1 [] (len=1, tl=0)
>       @1009cd388 09 CHARSXP g0c1 [MARK,gp=0x21] "a"
>> .Internal(inspect(`names<-`(l, "b")))
> @1028c82f8 19 VECSXP g0c1 [NAM(1),ATT] (len=1, tl=0)
>   @1028c8268 14 REALSXP g0c1 [] (len=1, tl=0) 1
> ATTRIB:
>   @100b6e748 02 LISTSXP g0c0 []
>     TAG: @100843878 01 SYMSXP g0c0 [MARK,gp=0x4000] "names"
>     @1028c8178 16 STRSXP g0c1 [NAM(1)] (len=1, tl=0)
>       @100967af8 09 CHARSXP g0c1 [MARK,gp=0x20] "b"
>> .Internal(inspect(l))
> @1028c82f8 19 VECSXP g0c1 [NAM(1),ATT] (len=1, tl=0)
>   @1028c8268 14 REALSXP g0c1 [] (len=1, tl=0) 1
> ATTRIB:
>   @100b6e748 02 LISTSXP g0c0 []
>     TAG: @100843878 01 SYMSXP g0c0 [MARK,gp=0x4000] "names"
>     @1028c8178 16 STRSXP g0c1 [NAM(1)] (len=1, tl=0)
>       @100967af8 09 CHARSXP g0c1 [MARK,gp=0x20] "b"
>

Interesting, I tried it. I found that setting the "row.names" attribute
that way keeps NAMED==1 ok, and that setting "class" attribute keeps
NAMED==1 ok too. Fantastic! But, it seems that merely printing it on the
console (when the class is set) bumps NAMED to 2. Here is the output :

> DF = list(a=1:3,b=4:6)
> `attr<-`(DF,"row.names",.set_row_names(3))
$a
[1] 1 2 3

$b
[1] 4 5 6

attr(,"row.names")
[1] 1 2 3
> .Internal(inspect(DF))    # great, NAM(1)
@261e730 19 VECSXP g0c1 [NAM(1),ATT] (len=2, tl=0)
  @2abd088 13 INTSXP g0c2 [] (len=3, tl=0) 1,2,3
  @2abd060 13 INTSXP g0c2 [] (len=3, tl=0) 4,5,6
ATTRIB:
  @258d4f4 02 LISTSXP g0c0 []
    TAG: @1612120 01 SYMSXP g0c0 [MARK,gp=0x4000] "names"
    @261e710 16 STRSXP g0c1 [NAM(2)] (len=2, tl=0)
      @17a86f8 09 CHARSXP g0c1 [MARK,gp=0x21] "a"
      @1766868 09 CHARSXP g0c1 [MARK,gp=0x21] "b"
    TAG: @1612040 01 SYMSXP g0c0 [MARK,gp=0x4000] "row.names"
    @261e5d0 13 INTSXP g0c1 [NAM(2)] (len=2, tl=0) -2147483648,-3
> .Internal(inspect(`attr<-`(DF,"class","data.frame")))
@261e730 19 VECSXP g0c1 [OBJ,NAM(1),ATT] (len=2, tl=0)  # great, NAM(1)
  @2abd088 13 INTSXP g0c2 [] (len=3, tl=0) 1,2,3
  @2abd060 13 INTSXP g0c2 [] (len=3, tl=0) 4,5,6
ATTRIB:
  @258d4f4 02 LISTSXP g0c0 []
    TAG: @1612120 01 SYMSXP g0c0 [MARK,gp=0x4000] "names"
    @261e710 16 STRSXP g0c1 [NAM(2)] (len=2, tl=0)
      @17a86f8 09 CHARSXP g0c1 [MARK,gp=0x21] "a"
      @1766868 09 CHARSXP g0c1 [MARK,gp=0x21] "b"
    TAG: @1612040 01 SYMSXP g0c0 [MARK,gp=0x4000] "row.names"
    @261e5d0 13 INTSXP g0c1 [NAM(2)] (len=2, tl=0) -2147483648,-3
    TAG: @1612388 01 SYMSXP g0c0 [MARK,gp=0x4000] "class"
    @2a758e8 16 STRSXP g0c1 [NAM(1)] (len=1, tl=0)
      @1647f38 09 CHARSXP g0c2 [MARK,gp=0x21,ATT] "data.frame"
> .Internal(inspect(DF))         # Great, NAM(1) still
@261e730 19 VECSXP g0c1 [OBJ,NAM(1),ATT] (len=2, tl=0)
  @2abd088 13 INTSXP g0c2 [] (len=3, tl=0) 1,2,3
  @2abd060 13 INTSXP g0c2 [] (len=3, tl=0) 4,5,6
ATTRIB:
  @258d4f4 02 LISTSXP g0c0 []
    TAG: @1612120 01 SYMSXP g0c0 [MARK,gp=0x4000] "names"
    @261e710 16 STRSXP g0c1 [NAM(2)] (len=2, tl=0)
      @17a86f8 09 CHARSXP g0c1 [MARK,gp=0x21] "a"
      @1766868 09 CHARSXP g0c1 [MARK,gp=0x21] "b"
    TAG: @1612040 01 SYMSXP g0c0 [MARK,gp=0x4000] "row.names"
    @261e5d0 13 INTSXP g0c1 [NAM(2)] (len=2, tl=0) -2147483648,-3
    TAG: @1612388 01 SYMSXP g0c0 [MARK,gp=0x4000] "class"
    @2a758e8 16 STRSXP g0c1 [NAM(1)] (len=1, tl=0)
      @1647f38 09 CHARSXP g0c2 [MARK,gp=0x21,ATT] "data.frame"
> DF
  a b
1 1 4
2 2 5
3 3 6
> .Internal(inspect(DF))  # just looking at it changes NAMED to 2 ?
@261e730 19 VECSXP g0c1 [OBJ,MARK,NAM(2),ATT] (len=2, tl=0)
  @2abd088 13 INTSXP g0c2 [MARK,NAM(2)] (len=3, tl=0) 1,2,3
  @2abd060 13 INTSXP g0c2 [MARK,NAM(2)] (len=3, tl=0) 4,5,6
ATTRIB:
  @258d4f4 02 LISTSXP g0c0 [MARK]
    TAG: @1612120 01 SYMSXP g0c0 [MARK,gp=0x4000] "names"
    @261e710 16 STRSXP g0c1 [MARK,NAM(2)] (len=2, tl=0)
      @17a86f8 09 CHARSXP g0c1 [MARK,gp=0x21] "a"
      @1766868 09 CHARSXP g0c1 [MARK,gp=0x21] "b"
    TAG: @1612040 01 SYMSXP g0c0 [MARK,gp=0x4000] "row.names"
    @261e5d0 13 INTSXP g0c1 [MARK,NAM(2)] (len=2, tl=0) -2147483648,-3
    TAG: @1612388 01 SYMSXP g0c0 [MARK,gp=0x4000] "class"
    @2a758e8 16 STRSXP g0c1 [MARK,NAM(2)] (len=1, tl=0)
      @1647f38 09 CHARSXP g0c2 [MARK,gp=0x21,ATT] "data.frame"

> identical(DF, data.frame(a=1:3,b=4:6))
[1] TRUE
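
For comparison, the same object can be built without calling any replacement function directly. This is a minimal sketch, not a claim about what NAMED ends up as: it attaches the attributes in one `structure()` call and reuses `.set_row_names()` from the session above for the compact row names.

```r
# Build the list and attach the data.frame attributes in a single call.
# .set_row_names(3L) yields c(NA, -3L), the compact form of row names 1:3.
DF2 <- structure(list(a = 1:3, b = 4:6),
                 row.names = .set_row_names(3L),
                 class = "data.frame")

# Same check as above: compare against the regular constructor.
identical(DF2, data.frame(a = 1:3, b = 4:6))
```

Since `structure()` is a closure rather than a primitive, this route trades any possible saving in copies for code that does not depend on the internals of `attr<-`.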

Matthew


Re: Confused about NAMED

Prof Brian Ripley
In reply to this post by Simon Urbanek
On Thu, 24 Nov 2011, Simon Urbanek wrote:

>
> On Nov 24, 2011, at 8:05 AM, Matthew Dowle wrote:
>
>>>
>>> On Nov 24, 2011, at 12:34 , Matthew Dowle wrote:
>>>
>>>>>
>>>>> On Nov 24, 2011, at 11:13 , Matthew Dowle wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I expected NAMED to be 1 in all these three cases. It is for one of
>>>>>> them,
>>>>>> but not the other two?
>>>>>>
>>>>>>> R --vanilla
>>>>>> R version 2.14.0 (2011-10-31)
>>>>>> Platform: i386-pc-mingw32/i386 (32-bit)
>>>>>>
>>>>>>> x = 1L
>>>>>>> .Internal(inspect(x))   # why NAM(2)? expected NAM(1)
>>>>>> @2514aa0 13 INTSXP g0c1 [NAM(2)] (len=1, tl=0) 1
>>>>>>
>>>>>>> y = 1:10
>>>>>>> .Internal(inspect(y))   # NAM(1) as expected but why different to x?
>>>>>> @272f788 13 INTSXP g0c4 [NAM(1)] (len=10, tl=0) 1,2,3,4,5,...
>>>>>>
>>>>>>> z = data.frame()
>>>>>>> .Internal(inspect(z))   # why NAM(2)? expected NAM(1)
>>>>>> @24fc28c 19 VECSXP g0c0 [OBJ,NAM(2),ATT] (len=0, tl=0)
>>>>>> ATTRIB:
>>>>>> @24fc270 02 LISTSXP g0c0 []
>>>>>>  TAG: @3f2120 01 SYMSXP g0c0 [MARK,gp=0x4000] "names"
>>>>>>  @24fc334 16 STRSXP g0c0 [] (len=0, tl=0)
>>>>>>  TAG: @3f2040 01 SYMSXP g0c0 [MARK,gp=0x4000] "row.names"
>>>>>>  @24fc318 13 INTSXP g0c0 [] (len=0, tl=0)
>>>>>>  TAG: @3f2388 01 SYMSXP g0c0 [MARK,gp=0x4000] "class"
>>>>>>  @25be500 16 STRSXP g0c1 [] (len=1, tl=0)
>>>>>>    @1d38af0 09 CHARSXP g0c2 [MARK,gp=0x21,ATT] "data.frame"
>>>>>>
>>>>>> It's a little difficult to search for the word "named" but I tried and
>>>>>> found this in R-ints :
>>>>>>
>>>>>>  "Note that optimizing NAMED = 1 is only effective within a primitive
>>>>>> (as the closure wrapper of a .Internal will set NAMED = 2 when the
>>>>>> promise to the argument is evaluated)"
>>>>>>
>>>>>> So might it be that just looking at NAMED using .Internal(inspect())
>>>>>> is
>>>>>> setting NAMED=2?  But if so, why does y have NAMED==1?
>>>>>
>>>>> This is tricky business... I'm not quite sure I'll get it right, but
>>>>> let's
>>>>> try
>>>>>
>>>>> When you are assigning a constant, the value you assign is already part
>>>>> of
>>>>> the assignment expression, so if you want to modify it, you must
>>>>> duplicate. So NAMED==2 on z <- 1 is basically to prevent you from
>>>>> accidentally "changing the value of 1". If it weren't, then you could
>>>>> get
>>>>> bitten by code like for(i in 1:2) {z <- 1; if(i==1) z[1] <- 2}.
>>>>>
>>>>> If you're assigning the result of a computation, then the object only
>>>>> exists once, so
>>>>> z <- 0+1  gets NAMED==1.
>>>>>
>>>>> However, if the computation is done by returning a named value from
>>>>> within
>>>>> a function, as in
>>>>>
>>>>>> f <- function(){v <- 1+0; v}
>>>>>> z <- f()
>>>>>
>>>>> then again NAMED==2. This is because the side effects of the function
>>>>> _might_ result in something having a hold on the function environment,
>>>>> e.g. if we had
>>>>>
>>>>> e <- NULL
>>>>> f <- function(){e <<-environment(); v <- 1+0; v}
>>>>> z <- f()
>>>>>
>>>>> then z[1] <- 5 would change e$v too. As it happens, there aren't any
>>>>> side
>>>>> effects in the former case, but R loses track and assumes the worst.
>>>>>
>>>>
>>>> Thanks a lot, think I follow. That explains x vs y, but why is z
>>>> NAMED==2?
>>>> The result of data.frame() is an object that exists once (similar to
>>>> 1:10)
>>>> so shouldn't it be NAMED==1 too?  Or, R loses track and assumes the
>>>> worst
>>>> even on its own functions such as data.frame()?
>>>
>>> R loses track. I suspect that is really all it can do without actual
>>> reference counting. The function data.frame is more than 150 lines of
>>> code, and if any of those end up invoking user code, possibly via a class
>>> method, you can't tell definitively whether or not the evaluation
>>> environment dies at the return.
>>
>> Ohhh, think I see now. After Duncan's reply I was going to ask if it was
>> possible to change data.frame() to be primitive so it could set NAMED=1.
>> But it seems primitive functions can't use R code so data.frame() would
>> need to be ported to C. Ok! - not quick or easy, and not without
>> considerable risk. And, data.frame() can invoke user code inside it anyway
>> then.

Maybe some review of the 'R Internals' manual about what a primitive
function is would be desirable.  Converting such a function to C would
ossify it, which is the major reason it has not been done (it has been
contemplated).
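
A quick way to see which of the functions mentioned in this thread are primitives is `is.primitive()`. The sketch below only checks the function type; it says nothing about how each one treats NAMED.

```r
# Functions implemented directly in C, with no R-level closure wrapper.
is.primitive(list)
is.primitive(`attr<-`)
is.primitive(`names<-`)
is.primitive(`attributes<-`)

# Ordinary closures written in R code.
is.primitive(data.frame)
is.primitive(structure)
```

The first four calls return TRUE and the last two return FALSE.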

>> Since list() is primitive I tried to construct a data.frame starting with
>> list() [since structure() isn't primitive], but then merely adding an
>> attribute seems to set NAMED==2 too ?
>>
>
> Yes, because attr(x,y) <- z is the same as
>
> `*tmp*` <- x
> x <- `attr<-`(`*tmp*`, y, z)
> rm(`*tmp*`)

Only if it were an interpreted function.

> so there are two references to the data frame: one in DF and one in
> `*tmp*`. It is the first line that causes the NAMED bump. And, yes,
> it's real:
>
>> `f<-`=function(x,value) { print(ls(parent.frame())); x<-value }
>> x=1
>> f(x)=1
> [1] "*tmp*" "f<-"   "x"

You have just explained why interpreted replacement functions set
NAMED=2, but this does not apply to primitives.

To help convince you, consider

> d <- 1:2
> attributes(d) <- list(x=13)
> d
[1] 1 2
attr(,"x")
[1] 13
> .Internal(inspect(d))
@11be748 13 INTSXP g0c1 [NAM(1),ATT] (len=2, tl=0) 1,2
ATTRIB:
   @1552054 02 LISTSXP g0c0 []
     TAG: @102b1c0 01 SYMSXP g0c0 [MARK,NAM(2)] "x"
     @11be768 14 REALSXP g0c1 [] (len=1, tl=0) 13

Now, as to why attr<- (which is primitive) does what it does you will
need to read (and understand) the code.
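
Whatever value of NAMED inspect reports, the user-visible guarantee it exists to protect is value semantics: modifying an object through one binding must not change it through another. A minimal sketch of that guarantee, using a second, hypothetical binding `d2`:

```r
d <- 1:2
d2 <- d                 # second binding to the same value
attr(d, "x") <- 13      # modify through the first binding only

attr(d, "x")            # 13
attributes(d2)          # NULL: d2 is unaffected
```

A NAMED bump that turns out to be unnecessary costs at most an extra copy later on; it never changes this observable behaviour.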

>
> You could skip that by using the function directly (I don't think it's recommended, though):
>
>> .Internal(inspect(l <- list(a=1)))
> @1028c82f8 19 VECSXP g0c1 [NAM(1),ATT] (len=1, tl=0)
>  @1028c8268 14 REALSXP g0c1 [] (len=1, tl=0) 1
> ATTRIB:
>  @100b6e748 02 LISTSXP g0c0 []
>    TAG: @100843878 01 SYMSXP g0c0 [MARK,gp=0x4000] "names"
>    @1028c82c8 16 STRSXP g0c1 [] (len=1, tl=0)
>      @1009cd388 09 CHARSXP g0c1 [MARK,gp=0x21] "a"
>> .Internal(inspect(`names<-`(l, "b")))
> @1028c82f8 19 VECSXP g0c1 [NAM(1),ATT] (len=1, tl=0)
>  @1028c8268 14 REALSXP g0c1 [] (len=1, tl=0) 1
> ATTRIB:
>  @100b6e748 02 LISTSXP g0c0 []
>    TAG: @100843878 01 SYMSXP g0c0 [MARK,gp=0x4000] "names"
>    @1028c8178 16 STRSXP g0c1 [NAM(1)] (len=1, tl=0)
>      @100967af8 09 CHARSXP g0c1 [MARK,gp=0x20] "b"
>> .Internal(inspect(l))
> @1028c82f8 19 VECSXP g0c1 [NAM(1),ATT] (len=1, tl=0)
>  @1028c8268 14 REALSXP g0c1 [] (len=1, tl=0) 1
> ATTRIB:
>  @100b6e748 02 LISTSXP g0c0 []
>    TAG: @100843878 01 SYMSXP g0c0 [MARK,gp=0x4000] "names"
>    @1028c8178 16 STRSXP g0c1 [NAM(1)] (len=1, tl=0)
>      @100967af8 09 CHARSXP g0c1 [MARK,gp=0x20] "b"
>
> Cheers,
> Simon
>
>
>
>>> DF = list(a=1:3,b=4:6)
>>> .Internal(inspect(DF))     # so far so good: NAM(1)
>> @25149e0 19 VECSXP g0c1 [NAM(1),ATT] (len=2, tl=0)
>>  @263ea50 13 INTSXP g0c2 [] (len=3, tl=0) 1,2,3
>>  @263eaa0 13 INTSXP g0c2 [] (len=3, tl=0) 4,5,6
>> ATTRIB:
>>  @2457984 02 LISTSXP g0c0 []
>>    TAG: @3f2120 01 SYMSXP g0c0 [MARK,gp=0x4000] "names"
>>    @25149c0 16 STRSXP g0c1 [] (len=2, tl=0)
>>      @1e987d8 09 CHARSXP g0c1 [MARK,gp=0x21] "a"
>>      @1e56948 09 CHARSXP g0c1 [MARK,gp=0x21] "b"
>>>
>>> attr(DF,"foo") <- "bar"    # just adding an attribute sets NAM(2) ?
>>> .Internal(inspect(DF))
>> @25149e0 19 VECSXP g0c1 [NAM(2),ATT] (len=2, tl=0)
>>  @263ea50 13 INTSXP g0c2 [] (len=3, tl=0) 1,2,3
>>  @263eaa0 13 INTSXP g0c2 [] (len=3, tl=0) 4,5,6
>> ATTRIB:
>>  @2457984 02 LISTSXP g0c0 []
>>    TAG: @3f2120 01 SYMSXP g0c0 [MARK,gp=0x4000] "names"
>>    @25149c0 16 STRSXP g0c1 [] (len=2, tl=0)
>>      @1e987d8 09 CHARSXP g0c1 [MARK,gp=0x21] "a"
>>      @1e56948 09 CHARSXP g0c1 [MARK,gp=0x21] "b"
>>    TAG: @245732c 01 SYMSXP g0c0 [] "foo"
>>    @25148a0 16 STRSXP g0c1 [NAM(1)] (len=1, tl=0)
>>      @2514920 09 CHARSXP g0c1 [gp=0x20] "bar"
>>
>>
>> Matthew
>>
>>
>>> --
>>> Peter Dalgaard, Professor
>>> Center for Statistics, Copenhagen Business School
>>> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
>>> Phone: (+45)38153501
>>> Email: [hidden email]  Priv: [hidden email]
>>>
>>>
>>
>> ______________________________________________
>> [hidden email] mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>>
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

--
Brian D. Ripley,                  [hidden email]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


Re: Confused about NAMED

Simon Urbanek
On Nov 24, 2011, at 1:48 PM, Prof Brian Ripley wrote:

> On Thu, 24 Nov 2011, Simon Urbanek wrote:
>
>>
>> On Nov 24, 2011, at 8:05 AM, Matthew Dowle wrote:
>>
>>>>
>>>> On Nov 24, 2011, at 12:34 , Matthew Dowle wrote:
>>>>
>>>>>>
>>>>>> On Nov 24, 2011, at 11:13 , Matthew Dowle wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I expected NAMED to be 1 in all these three cases. It is for one of
>>>>>>> them,
>>>>>>> but not the other two?
>>>>>>>
>>>>>>>> R --vanilla
>>>>>>> R version 2.14.0 (2011-10-31)
>>>>>>> Platform: i386-pc-mingw32/i386 (32-bit)
>>>>>>>
>>>>>>>> x = 1L
>>>>>>>> .Internal(inspect(x))   # why NAM(2)? expected NAM(1)
>>>>>>> @2514aa0 13 INTSXP g0c1 [NAM(2)] (len=1, tl=0) 1
>>>>>>>
>>>>>>>> y = 1:10
>>>>>>>> .Internal(inspect(y))   # NAM(1) as expected but why different to x?
>>>>>>> @272f788 13 INTSXP g0c4 [NAM(1)] (len=10, tl=0) 1,2,3,4,5,...
>>>>>>>
>>>>>>>> z = data.frame()
>>>>>>>> .Internal(inspect(z))   # why NAM(2)? expected NAM(1)
>>>>>>> @24fc28c 19 VECSXP g0c0 [OBJ,NAM(2),ATT] (len=0, tl=0)
>>>>>>> ATTRIB:
>>>>>>> @24fc270 02 LISTSXP g0c0 []
>>>>>>> TAG: @3f2120 01 SYMSXP g0c0 [MARK,gp=0x4000] "names"
>>>>>>> @24fc334 16 STRSXP g0c0 [] (len=0, tl=0)
>>>>>>> TAG: @3f2040 01 SYMSXP g0c0 [MARK,gp=0x4000] "row.names"
>>>>>>> @24fc318 13 INTSXP g0c0 [] (len=0, tl=0)
>>>>>>> TAG: @3f2388 01 SYMSXP g0c0 [MARK,gp=0x4000] "class"
>>>>>>> @25be500 16 STRSXP g0c1 [] (len=1, tl=0)
>>>>>>>   @1d38af0 09 CHARSXP g0c2 [MARK,gp=0x21,ATT] "data.frame"
>>>>>>>
>>>>>>> It's a little difficult to search for the word "named" but I tried and
>>>>>>> found this in R-ints :
>>>>>>>
>>>>>>> "Note that optimizing NAMED = 1 is only effective within a primitive
>>>>>>> (as the closure wrapper of a .Internal will set NAMED = 2 when the
>>>>>>> promise to the argument is evaluated)"
>>>>>>>
>>>>>>> So might it be that just looking at NAMED using .Internal(inspect())
>>>>>>> is
>>>>>>> setting NAMED=2?  But if so, why does y have NAMED==1?
>>>>>>
>>>>>> This is tricky business... I'm not quite sure I'll get it right, but
>>>>>> let's
>>>>>> try
>>>>>>
>>>>>> When you are assigning a constant, the value you assign is already part
>>>>>> of
>>>>>> the assignment expression, so if you want to modify it, you must
>>>>>> duplicate. So NAMED==2 on z <- 1 is basically to prevent you from
>>>>>> accidentally "changing the value of 1". If it weren't, then you could
>>>>>> get
>>>>>> bitten by code like for(i in 1:2) {z <- 1; if(i==1) z[1] <- 2}.
>>>>>>
>>>>>> If you're assigning the result of a computation, then the object only
>>>>>> exists once, so
>>>>>> z <- 0+1  gets NAMED==1.
>>>>>>
>>>>>> However, if the computation is done by returning a named value from
>>>>>> within
>>>>>> a function, as in
>>>>>>
>>>>>>> f <- function(){v <- 1+0; v}
>>>>>>> z <- f()
>>>>>>
>>>>>> then again NAMED==2. This is because the side effects of the function
>>>>>> _might_ result in something having a hold on the function environment,
>>>>>> e.g. if we had
>>>>>>
>>>>>> e <- NULL
>>>>>> f <- function(){e <<-environment(); v <- 1+0; v}
>>>>>> z <- f()
>>>>>>
>>>>>> then z[1] <- 5 would change e$v too. As it happens, there aren't any
>>>>>> side
>>>>>> effects in the former case, but R loses track and assumes the worst.
>>>>>>
>>>>>
>>>>> Thanks a lot, think I follow. That explains x vs y, but why is z
>>>>> NAMED==2?
>>>>> The result of data.frame() is an object that exists once (similar to
>>>>> 1:10)
>>>>> so shouldn't it be NAMED==1 too?  Or, R loses track and assumes the
>>>>> worst
>>>>> even on its own functions such as data.frame()?
>>>>
>>>> R loses track. I suspect that is really all it can do without actual
>>>> reference counting. The function data.frame is more than 150 lines of
>>>> code, and if any of those end up invoking user code, possibly via a class
>>>> method, you can't tell definitively whether or not the evaluation
>>>> environment dies at the return.
>>>
>>> Ohhh, think I see now. After Duncan's reply I was going to ask if it was
>>> possible to change data.frame() to be primitive so it could set NAMED=1.
>>> But it seems primitive functions can't use R code so data.frame() would
>>> need to be ported to C. Ok! - not quick or easy, and not without
>>> considerable risk. And, data.frame() can invoke user code inside it anyway
>>> then.
>
> Maybe some review of the 'R Internals' manual about what a primitive function is would be desirable.  Converting such a function to C would ossify it, which is the major reason it has not been done (it has been contemplated).
>
>>> Since list() is primitive I tried to construct a data.frame starting with
>>> list() [since structure() isn't primitive], but then merely adding an
>>> attribute seems to set NAMED==2 too ?
>>>
>>
>> Yes, because attr(x,y) <- z is the same as
>>
>> `*tmp*` <- x
>> x <- `attr<-`(`*tmp*`, y, z)
>> rm(`*tmp*`)
>
> Only if it were an interpreted function.
>
>> so there are two references to the data frame: one in DF and one in `*tmp*`. It is the first line that causes the NAMED bump. And, yes, it's real:
>>
>>> `f<-`=function(x,value) { print(ls(parent.frame())); x<-value }
>>> x=1
>>> f(x)=1
>> [1] "*tmp*" "f<-"   "x"
>
> You have just explained why interpreted replacement functions set NAMED=2, but this does not apply to primitives.
>

It does - see eval.c l1680-2, which causes it to go through do_set, which in turn bumps NAMED. I have responded only to Luke but I guess I should have included everyone.
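
For readers following along, the expansion quoted above can be written out by hand and compared against the built-in replacement syntax. This is only a sketch of the documented semantics: it checks that both routes give the same value, and says nothing about which NAMED count either one ends up with. The variable names same_result and foo_value are made up for the illustration.

```r
# Route 1: ordinary replacement syntax.
x <- list(a = 1:3, b = 4:6)
attr(x, "foo") <- "bar"

# Route 2: the same update spelled out as the three-step expansion.
y <- list(a = 1:3, b = 4:6)
`*tmp*` <- y
y <- `attr<-`(`*tmp*`, "foo", "bar")
rm(`*tmp*`)

# Both routes should produce the same object.
same_result <- identical(x, y)
foo_value <- attr(y, "foo")
```

The first assignment in route 2 is the step that creates a second binding to the object, which is the reference the discussion above is about.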


> To help convince you, consider
>
>> d <- 1:2
>> attributes(d) <- list(x=13)
>> d
> [1] 1 2
> attr(,"x")
> [1] 13
>> .Internal(inspect(d))
> @11be748 13 INTSXP g0c1 [NAM(1),ATT] (len=2, tl=0) 1,2
> ATTRIB:
>  @1552054 02 LISTSXP g0c0 []
>    TAG: @102b1c0 01 SYMSXP g0c0 [MARK,NAM(2)] "x"
>    @11be768 14 REALSXP g0c1 [] (len=1, tl=0) 13
>
> Now, as to why attr<- (which is primitive) does what it does you will need to read (and understand) the code.
>

Because do_attributesgets duplicates (attrib.c l1178), which you can easily see:

> d <- 1:2
> .Internal(inspect(d))
@155aba8 13 INTSXP g0c1 [NAM(1)] (len=2, tl=0) 1,2
> attributes(d) <- list(x=13)
> .Internal(inspect(d))
@15dbe28 13 INTSXP g0c1 [NAM(1),ATT] (len=2, tl=0) 1,2
ATTRIB:
  @16da5a8 02 LISTSXP g0c0 []
    TAG: @660008 01 SYMSXP g0c0 [MARK,NAM(2)] "x"
    @15dbe58 14 REALSXP g0c1 [] (len=1, tl=0) 13

Note the different pointer of the value of d now -- do_attributesgets returns a duplicate with NAMED=0 so do_set assignment bumps it to 1.
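
Whatever the exact NAMED values turn out to be, the guarantee they exist to protect can be checked without inspect(). A minimal sketch reusing the examples from earlier in the thread; the result names (loop_z, env_v, d_attr) are made up here, and the inspect() output itself may look different in other R versions.

```r
# Assigning a constant in a loop: modifying z must not change the
# constant 1 stored in the loop body.
for (i in 1:2) { z <- 1; if (i == 1) z[1] <- 2 }
loop_z <- z

# A function whose environment stays reachable after it returns:
# modifying the returned value must not change v in that environment.
e <- NULL
f <- function() { e <<- environment(); v <- 1 + 0; v }
z <- f()
z[1] <- 5
env_v <- e$v

# attributes<- on a fresh vector: the data and the new attribute are
# both as expected after the assignment.
d <- 1:2
attributes(d) <- list(x = 13)
d_attr <- attr(d, "x")
```

If R did not duplicate in the first two cases, loop_z would be 2 and env_v would be 5; with the duplication in place both stay at 1.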

Cheers,
Simon



>>
>> You could skip that by using the function directly (I don't think it's recommended, though):
>>
>>> .Internal(inspect(l <- list(a=1)))
>> @1028c82f8 19 VECSXP g0c1 [NAM(1),ATT] (len=1, tl=0)
>> @1028c8268 14 REALSXP g0c1 [] (len=1, tl=0) 1
>> ATTRIB:
>> @100b6e748 02 LISTSXP g0c0 []
>>   TAG: @100843878 01 SYMSXP g0c0 [MARK,gp=0x4000] "names"
>>   @1028c82c8 16 STRSXP g0c1 [] (len=1, tl=0)
>>     @1009cd388 09 CHARSXP g0c1 [MARK,gp=0x21] "a"
>>> .Internal(inspect(`names<-`(l, "b")))
>> @1028c82f8 19 VECSXP g0c1 [NAM(1),ATT] (len=1, tl=0)
>> @1028c8268 14 REALSXP g0c1 [] (len=1, tl=0) 1
>> ATTRIB:
>> @100b6e748 02 LISTSXP g0c0 []
>>   TAG: @100843878 01 SYMSXP g0c0 [MARK,gp=0x4000] "names"
>>   @1028c8178 16 STRSXP g0c1 [NAM(1)] (len=1, tl=0)
>>     @100967af8 09 CHARSXP g0c1 [MARK,gp=0x20] "b"
>>> .Internal(inspect(l))
>> @1028c82f8 19 VECSXP g0c1 [NAM(1),ATT] (len=1, tl=0)
>> @1028c8268 14 REALSXP g0c1 [] (len=1, tl=0) 1
>> ATTRIB:
>> @100b6e748 02 LISTSXP g0c0 []
>>   TAG: @100843878 01 SYMSXP g0c0 [MARK,gp=0x4000] "names"
>>   @1028c8178 16 STRSXP g0c1 [NAM(1)] (len=1, tl=0)
>>     @100967af8 09 CHARSXP g0c1 [MARK,gp=0x20] "b"
>>
>> Cheers,
>> Simon
>>
>>
>>
>>>> DF = list(a=1:3,b=4:6)
>>>> .Internal(inspect(DF))     # so far so good: NAM(1)
>>> @25149e0 19 VECSXP g0c1 [NAM(1),ATT] (len=2, tl=0)
>>> @263ea50 13 INTSXP g0c2 [] (len=3, tl=0) 1,2,3
>>> @263eaa0 13 INTSXP g0c2 [] (len=3, tl=0) 4,5,6
>>> ATTRIB:
>>> @2457984 02 LISTSXP g0c0 []
>>>   TAG: @3f2120 01 SYMSXP g0c0 [MARK,gp=0x4000] "names"
>>>   @25149c0 16 STRSXP g0c1 [] (len=2, tl=0)
>>>     @1e987d8 09 CHARSXP g0c1 [MARK,gp=0x21] "a"
>>>     @1e56948 09 CHARSXP g0c1 [MARK,gp=0x21] "b"
>>>>
>>>> attr(DF,"foo") <- "bar"    # just adding an attribute sets NAM(2) ?
>>>> .Internal(inspect(DF))
>>> @25149e0 19 VECSXP g0c1 [NAM(2),ATT] (len=2, tl=0)
>>> @263ea50 13 INTSXP g0c2 [] (len=3, tl=0) 1,2,3
>>> @263eaa0 13 INTSXP g0c2 [] (len=3, tl=0) 4,5,6
>>> ATTRIB:
>>> @2457984 02 LISTSXP g0c0 []
>>>   TAG: @3f2120 01 SYMSXP g0c0 [MARK,gp=0x4000] "names"
>>>   @25149c0 16 STRSXP g0c1 [] (len=2, tl=0)
>>>     @1e987d8 09 CHARSXP g0c1 [MARK,gp=0x21] "a"
>>>     @1e56948 09 CHARSXP g0c1 [MARK,gp=0x21] "b"
>>>   TAG: @245732c 01 SYMSXP g0c0 [] "foo"
>>>   @25148a0 16 STRSXP g0c1 [NAM(1)] (len=1, tl=0)
>>>     @2514920 09 CHARSXP g0c1 [gp=0x20] "bar"
>>>
>>>
>>> Matthew
>>>
>>>
>>>> --
>>>> Peter Dalgaard, Professor
>>>> Center for Statistics, Copenhagen Business School
>>>> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
>>>> Phone: (+45)38153501
>>>> Email: [hidden email]  Priv: [hidden email]
>>>>
>>>>
>>>
>>
>
> --
> Brian D. Ripley,                  [hidden email]
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford,             Tel:  +44 1865 272861 (self)
> 1 South Parks Road,                     +44 1865 272866 (PA)
> Oxford OX1 3TG, UK                Fax:  +44 1865 272595
>
>

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel