# Sorting strings

## Re: Sorting strings

On Mon, Feb 20, 2012 at 02:18:42AM -0800, statquant2 wrote:
> Hi all, I am having difficulties to understand how R sort strings:
>
> If I do
> R) sort(c("X.","X0B"))
> [1] "X."  "X0B"
>
> So for me, as far as lexicographic order is concerned I can add whatever to
> the end, the order will remain the same, but :

Hi.

This need not be true for strings of different length.
For example

    ab
    abc

become by concatenation with z

    abcz
    abz

Petr Savicky.

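
The effect described above can be checked with a minimal sketch. It assumes R 3.3.0 or later, where `sort(x, method = "radix")` orders character vectors as in the C locale regardless of the session locale; that call does not appear in the thread and is used here only to make the result reproducible.

```r
# Appending the same suffix to strings of different length can swap their order.
# method = "radix" is an assumption of this sketch: it gives C-locale (byte-wise)
# ordering, so the result does not depend on the session locale.
x <- c("ab", "abc")

before <- sort(x, method = "radix")
after  <- sort(paste0(x, "z"), method = "radix")

print(before)  # "ab" "abc": a prefix sorts before the longer string
print(after)   # "abcz" "abz": now 'c' is compared with 'z' at position 3
```

The swap happens because the third character of "abz" is the appended "z", which sorts after the "c" of "abcz".
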
## Re: Sorting strings

"Petr Savicky" <[hidden email]> wrote in message news:[hidden email]...
> On Mon, Feb 20, 2012 at 02:18:42AM -0800, statquant2 wrote:
>> Hi all, I am having difficulties to understand how R sort strings:
>>
>> If I do
>> R) sort(c("X.","X0B"))
>> [1] "X."  "X0B"
>>
>> So for me, as far as lexicographic order is concerned I can add whatever
>> to the end, the order will remain the same, but :
>
> Hi.
>
> This need not be true for strings of different length.
> For example
>
>  ab
>  abc
>
> become by concatenation with z
>
>  abcz
>  abz
>
> Petr Savicky.

That's not the explanation in this case. The OP isn't telling us everything. I get [R version 2.14.1 Platform: i386-pc-mingw32/i386 (32-bit)]:

    > sort(c("X.","X0B"))
    [1] "X."  "X0B"
    > sort(c("X.Z","X0B.Z"))
    [1] "X.Z"   "X0B.Z"

KJ

## Re: Sorting strings

In reply to this post by Petr Savicky

See ?Comparison, which holds some warnings about what to expect when sorting strings.

On 20.02.2012 11:51, Petr Savicky wrote:
> On Mon, Feb 20, 2012 at 02:18:42AM -0800, statquant2 wrote:
>> Hi all, I am having difficulties to understand how R sort strings:
>>
>> If I do
>> R) sort(c("X.","X0B"))
>> [1] "X."  "X0B"
>>
>> So for me, as far as lexicographic order is concerned I can add whatever to
>> the end, the order will remain the same, but :
>
> Hi.
>
> This need not be true for strings of different length.
> For example
>
>    ab
>    abc
>
> become by concatenation with z
>
>    abcz
>    abz
>
> Petr Savicky.

--
Enrico Schumann
Lucerne, Switzerland
http://nmof.net/

## Re: Sorting strings

I don't *think* it's version specific, but rather it depends on your (still unstated) locale, as the documentation goes to great lengths to point out. Change that and you might see different behaviors.

Michael

On Mon, Feb 20, 2012 at 8:55 AM, statquant2 <[hidden email]> wrote:
> I did, but this does not give the answer to my question...
> Anybody knows how to tweak the behaviour of sort or how to do ?

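
One way to check whether the locale is what changes the result is to compare the default sort with a locale-independent one. This is only a sketch: it assumes R 3.3.0 or later, where `sort(x, method = "radix")` sorts character vectors in C-locale order, and the variable names are made up for illustration.

```r
# Compare the locale-aware sort with a C-locale sort of the same strings.
x <- c("X.Z", "X0B.Z")

locale_order <- sort(x)                    # uses the session's LC_COLLATE
c_order      <- sort(x, method = "radix")  # assumption: C-locale order (R >= 3.3.0)

cat("LC_COLLATE:", Sys.getlocale("LC_COLLATE"), "\n")
cat("Locale changes the order:", !identical(locale_order, c_order), "\n")
```

If the second line reports TRUE, the collation rules of the current locale differ from plain byte order for these strings.
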
## Re: Sorting strings

Hello,

statquant2 wrote
> Ok so it changed from 2.12.2 to 2.14.1 ??
> Can somebody tell me how to modify my sort or whatever to get the same result that I would get in 2.14.1 ?
> Cheers

I don't know about 2.12.2 but for 2.12.0 I get:

    > R.version
                   _
    platform       i386-pc-mingw32
    arch           i386
    os             mingw32
    system         i386, mingw32
    status
    major          2
    minor          12.0
    year           2010
    month          10
    day            15
    svn rev        53317
    language       R
    version.string R version 2.12.0 (2010-10-15)
    > sort(c("X.","X0B"))
    [1] "X."  "X0B"
    > sort(c("X.Z","X0B.Z"))
    [1] "X.Z"   "X0B.Z"

And the same for 2.14.1:

    > R.version
                   _
    platform       i386-pc-mingw32
    [... deleted...]
    version.string R version 2.14.1 (2011-12-22)
    > sort(c("X.","X0B"))
    [1] "X."  "X0B"
    > sort(c("X.Z","X0B.Z"))
    [1] "X.Z"   "X0B.Z"

Could it be OS related?

Rui Barradas.

## Re: Sorting strings

In reply to this post by statquant2

On Mon, Feb 20, 2012 at 05:55:30AM -0800, statquant2 wrote:
> I did, but this does not give the answer to my question...
> Anybody knows how to tweak the behaviour of sort or how to do ?

Hi.

Try this

    Sys.setlocale("LC_COLLATE", "C")

This comes from ?locale and reads there

    Sys.setlocale("LC_COLLATE", "C")   # turn off locale-specific sorting,
                                       #  usually

See also

?sort

    The sort order for character vectors will depend on the collating
    sequence of the locale in use: see 'Comparison'.

?Comparison

    Comparison of strings in character vectors is lexicographic within
    the strings using the collating sequence of the locale in use: see
    'locales'.  The collating sequence of locales such as 'en_US' is
    normally different from 'C' (which should use ASCII) and can be
    surprising.  Beware of making _any_ assumptions about the
    collation order: ...

Hope this helps.

Petr Savicky.

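
The `Sys.setlocale("LC_COLLATE", "C")` suggestion can be wrapped so that the change does not leak into the rest of the session. A minimal sketch follows; the helper name `sort_c` is hypothetical and not part of R or of this thread.

```r
# Sort in the C locale, then restore the previous collation setting.
# sort_c is a hypothetical helper name used only for this illustration.
sort_c <- function(x) {
  old <- Sys.getlocale("LC_COLLATE")
  on.exit(Sys.setlocale("LC_COLLATE", old), add = TRUE)  # restore on exit
  Sys.setlocale("LC_COLLATE", "C")                       # byte-wise collation
  sort(x)
}

print(sort_c(c("X.", "X0B")))      # "X."  "X0B"
print(sort_c(c("X.Z", "X0B.Z")))   # "X.Z" "X0B.Z"
```

In the C locale "." (ASCII 46) sorts before "0" (ASCII 48), so "X.Z" comes first, matching the Windows results reported in this thread.
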
## Re: Sorting strings

In reply to this post by statquant2

It seems OS-dependent. I got different results when trying it on windows xp and Redhat linux.

    > R.version
                   _
    platform       x86_64-unknown-linux-gnu
    arch           x86_64
    os             linux-gnu
    system         x86_64, linux-gnu
    status
    major          2
    minor          9.1
    year           2009
    month          06
    day            26
    svn rev        48839
    language       R
    version.string R version 2.9.1 (2009-06-26)
    > sort(c("X.","X0B"))
    [1] "X."  "X0B"
    > sort(c("X.Z","X0B.Z"))
    [1] "X.Z"   "X0B.Z"

    > R.version
                   _
    platform       x86_64-unknown-linux-gnu
    arch           x86_64
    os             linux-gnu
    system         x86_64, linux-gnu
    status
    major          2
    minor          9.1
    year           2009
    month          06
    day            26
    svn rev        48839
    language       R
    version.string R version 2.9.1 (2009-06-26)
    > sort(c("X.","X0B"))
    [1] "X."  "X0B"
    > sort(c("X.Z","X0B.Z"))
    [1] "X0B.Z" "X.Z"

On 2012-2-20 23:27, statquant2 wrote:
> Ok I have :
>
> R) str(R.Version())
> List of 13
>  $ platform      : chr "x86_64-unknown-linux-gnu"
>  $ arch          : chr "x86_64"
>  $ os            : chr "linux-gnu"
>  $ system        : chr "x86_64, linux-gnu"
>  $ status        : chr ""
>  $ major         : chr "2"
>  $ minor         : chr "12.2"
>  $ year          : chr "2011"
>  $ month         : chr "02"
>  $ day           : chr "25"
>  $ svn rev       : chr "54585"
>  $ language      : chr "R"
>  $ version.string: chr "R version 2.12.2 (2011-02-25)"
>
> R) sort(c("X.","X0B"))
> [1] "X."  "X0B"
> R) sort(c("X.Z","X0B.Z"))
> [1] "X0B.Z" "X.Z"
>
> I am using a linux redHat
> $ uname -a
> Linux 2.6.18-238.9.1.el5 #1 SMP Fri Mar 18 12:42:39 EDT 2011 x86_64 x86_64 x86_64 GNU/Linux

## Re: Sorting strings

In reply to this post by Petr Savicky

On Mon, Feb 20, 2012 at 04:56:21PM +0100, Petr Savicky wrote:
> On Mon, Feb 20, 2012 at 05:55:30AM -0800, statquant2 wrote:
> > I did, but this does not give the answer to my question...
> > Anybody knows how to tweak the behaviour of sort or how to do ?
>
> Hi.
>
> Try this
>
>   Sys.setlocale("LC_COLLATE", "C")
>
>
> This comes from ?locale and reads there

This is not in ?locale, but in ?locales

>      Sys.setlocale("LC_COLLATE", "C")   # turn off locale-specific sorting,
>                                         #  usually

This is in the example section at the end.

Also try this to see the current settings

    Sys.getlocale()

LC_CTYPE can also be relevant

    Sys.setlocale("LC_CTYPE", "C")

Hope this helps.

Petr Savicky.

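
To report the two categories mentioned above when asking for help, something like the following sketch may be enough. It only reads the settings and changes nothing; the variable names are illustrative.

```r
# Query the collation and character-type locale categories individually.
# Sys.getlocale() accepts a category name; the values differ between systems.
cats <- c("LC_COLLATE", "LC_CTYPE")
current <- vapply(cats, Sys.getlocale, character(1))

print(current)
```

Including this output in a question lets others tell whether a different sort order comes from the locale rather than from the R version.
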
## Re: Sorting strings

In reply to this post by De-Jian Zhao

Sorry, just made a mistake. This is the result from windows xp.

    > sort(c("X.","X0B"))
    [1] "X."  "X0B"
    > sort(c("X.Z","X0B.Z"))
    [1] "X.Z"   "X0B.Z"
    > R.version
                   _
    platform       i386-pc-mingw32
    arch           i386
    os             mingw32
    system         i386, mingw32
    status
    major          2
    minor          13.0
    year           2011
    month          04
    day            13
    svn rev        55427
    language       R
    version.string R version 2.13.0 (2011-04-13)

On 2012-2-21 0:13, De-Jian Zhao wrote:
> It seems OS-dependent. I got different results when trying it on
> windows xp and Redhat linux.
>
> > R.version
>                _
> platform       x86_64-unknown-linux-gnu
> arch           x86_64
> os             linux-gnu
> system         x86_64, linux-gnu
> status
> major          2
> minor          9.1
> year           2009
> month          06
> day            26
> svn rev        48839
> language       R
> version.string R version 2.9.1 (2009-06-26)
> > sort(c("X.","X0B"))
> [1] "X."  "X0B"
> > sort(c("X.Z","X0B.Z"))
> [1] "X.Z"   "X0B.Z"
>
> > R.version
>                _
> platform       x86_64-unknown-linux-gnu
> arch           x86_64
> os             linux-gnu
> system         x86_64, linux-gnu
> status
> major          2
> minor          9.1
> year           2009
> month          06
> day            26
> svn rev        48839
> language       R
> version.string R version 2.9.1 (2009-06-26)
> > sort(c("X.","X0B"))
> [1] "X."  "X0B"
> > sort(c("X.Z","X0B.Z"))
> [1] "X0B.Z" "X.Z"
>
> On 2012-2-20 23:27, statquant2 wrote:
>> Ok I have :
>>
>> R) str(R.Version())
>> List of 13
>>  $ platform      : chr "x86_64-unknown-linux-gnu"
>>  $ arch          : chr "x86_64"
>>  $ os            : chr "linux-gnu"
>>  $ system        : chr "x86_64, linux-gnu"
>>  $ status        : chr ""
>>  $ major         : chr "2"
>>  $ minor         : chr "12.2"
>>  $ year          : chr "2011"
>>  $ month         : chr "02"
>>  $ day           : chr "25"
>>  $ svn rev       : chr "54585"
>>  $ language      : chr "R"
>>  $ version.string: chr "R version 2.12.2 (2011-02-25)"
>>
>> R) sort(c("X.","X0B"))
>> [1] "X."  "X0B"
>> R) sort(c("X.Z","X0B.Z"))
>> [1] "X0B.Z" "X.Z"
>>
>> I am using a linux redHat
>> $ uname -a
>> Linux 2.6.18-238.9.1.el5 #1 SMP Fri Mar 18 12:42:39 EDT 2011 x86_64 x86_64 x86_64 GNU/Linux