random output with sub(fixed = TRUE)

Roger D. Peng
I've noticed what I think is curious behavior in using 'sub(fixed = TRUE)' and
was wondering if my expectation is incorrect.  Here is one example:

v <- paste(0:10, "asdf", sep = ".")
sub(".asdf", "", v, fixed = TRUE)

The results I get are

 > sub(".asdf", "", v, fixed = TRUE)
  [1] "0"               "1\0st\0\0"       "2\0<af>\001\0\0" "3\0<af>\001\0\0"
  [5] "4\0mes\0"        "5\0<ba>\001\0\0" "6\0\0\0\0\0"     "7\0\0\0m\0"
  [9] "8\0\0\0t\0"      "9\0<fe>\0\0\0"   "10\0\0\0\0\0"
 >

I expected "0" in the first entry and everything else would be unchanged.  Your
results may vary since every time I run 'sub()' in this way, I get a slightly
different answer in entries 2 through 11.

As it turns out, 'gsub(fixed = TRUE)' gives me the answer I *actually* wanted,
which was to replace the string in every entry.  But I still think the behavior
of 'sub(fixed = TRUE)' is a bit odd.

 > version
          _
platform x86_64-unknown-linux-gnu
arch     x86_64
os       linux-gnu
system   x86_64, linux-gnu
status
major    2
minor    2.1
year     2005
month    12
day      20
svn rev  36812
language R
 >

-roger
--
Roger D. Peng  |  http://www.biostat.jhsph.edu/~rpeng/

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: random output with sub(fixed = TRUE)

Peter Dalgaard
"Roger D. Peng" <[hidden email]> writes:

>  > version
>           _
> platform x86_64-unknown-linux-gnu
> arch     x86_64
> os       linux-gnu
> system   x86_64, linux-gnu
> status
> major    2
> minor    2.1
> year     2005
> month    12
> day      20
> svn rev  36812
> language R
>  >

Argh...

year     2005
month    12
day      21

and something like this gets discovered. It's a ritual, I tell ya, a ritual!

If you look at the output and terminate all strings at the embedded
\0, it looks much more sensible, so it should be fairly easy to spot
the cause of this bug...
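
A minimal C sketch of that observation, not the R source: the helper name
c_style_length is hypothetical, and the byte strings in the usage note are
copied from the output quoted in this thread. It reports how many bytes a C
routine such as strlen() would see in a buffer whose recorded length is larger.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: number of bytes before the first nul in a buffer
   whose true (recorded) length is n. This is the length strlen() would
   report; if there is no nul within n bytes, the true length is returned. */
size_t c_style_length(const char *buf, size_t n)
{
    const char *nul = memchr(buf, '\0', n);
    return nul ? (size_t)(nul - buf) : n;
}
```

For the second entry, c_style_length("1\0st\0\0", 6) gives 1: the buffer holds
6 bytes, the same count as "1.asdf", but only the leading "1" comes before the
first nul.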

--
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - ([hidden email])                  FAX: (+45) 35327907

Re: random output with sub(fixed = TRUE)

Roger D. Peng
Well, who am I to break this long-standing ritual? :)

Interestingly, while the printed output looks wrong, I get

 > v <- paste(0:10, "asdf", sep = ".")
 > a <- sub(".asdf", "", v, fixed = TRUE)
 > b <- as.character(0:10)
 > identical(a, b)
[1] TRUE
 >

-roger

--
Roger D. Peng  |  http://www.biostat.jhsph.edu/~rpeng/

Re: random output with sub(fixed = TRUE)

Prof Brian Ripley
On Wed, 21 Dec 2005, Roger D. Peng wrote:

> Well, who am I to break this long-standing ritual? :)
>
> Interestingly, while the printed output looks wrong, I get
>
> > v <- paste(0:10, "asdf", sep = ".")
> > a <- sub(".asdf", "", v, fixed = TRUE)
> > b <- as.character(0:10)
> > identical(a, b)
> [1] TRUE
> >

identical is wrong!  R character strings have a true length and a C-style
length: print() prints all the characters, even those after embedded
nuls.  identical uses

     if(strcmp(CHAR(STRING_ELT(x, i)),
       CHAR(STRING_ELT(y, i))) != 0)

which is C-style.
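
To illustrate the difference, here is a self-contained C sketch. It is not R's
code: struct lstring and both function names are hypothetical stand-ins for a
string that carries its own recorded length. It contrasts a strcmp()-based
test, which stops at the first nul, with one that compares every recorded byte.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a string with a recorded (true) length. */
struct lstring {
    const char *bytes;  /* must also be nul-terminated for strcmp() */
    size_t      len;    /* true length, may include embedded nuls */
};

/* C-style equality: stops at the first nul, as strcmp() does. */
int equal_c_style(struct lstring a, struct lstring b)
{
    return strcmp(a.bytes, b.bytes) == 0;
}

/* Length-aware equality: compares every recorded byte. */
int equal_full(struct lstring a, struct lstring b)
{
    return a.len == b.len && memcmp(a.bytes, b.bytes, a.len) == 0;
}
```

With a = { "1\0st\0\0", 6 } and b = { "1", 1 }, equal_c_style(a, b) is true
while equal_full(a, b) is false, which matches identical() returning TRUE even
though print() shows the trailing bytes.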

The issue is character.c:1015 whose nr gets trashed: note the first answer
in the vector is correct.  So easy to fix.

This code has been as it is now for years, so I don't think this is at all
related to the release of 2.2.1.

--
Brian D. Ripley,                  [hidden email]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

Re: random output with sub(fixed = TRUE)

Duncan Murdoch
In reply to this post by Roger D. Peng
On 12/21/2005 5:13 PM, Roger D. Peng wrote:

> Well, who am I to break this long-standing ritual? :)
>
> Interestingly, while the printed output looks wrong, I get
>
>  > v <- paste(0:10, "asdf", sep = ".")
>  > a <- sub(".asdf", "", v, fixed = TRUE)
>  > b <- as.character(0:10)
>  > identical(a, b)
> [1] TRUE
>  >
>
> -roger

I think finding two separate bugs on the day after the release goes a
bit beyond what is necessary to satisfy the ritual.

Duncan Murdoch
