pad leading zeros in front of strings

Hui Du
Dear All,

This question sounds very simple, but I don't know where I am going wrong. I just want to pad a string with leading zeros; for example, "123" becomes "00123". What is wrong with the following?

> sprintf("%05s", "123")
[1] "  123"


It didn't return "00123"; instead it padded with blanks.


Thank you for your help in advance.

HXD

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: pad leading zeros in front of strings

Michael Weylandt
I think once upon a time this was found to be OS-dependent since it
calls the system's C  sprintf()  -- I get the leading zeros on Mac. I
presume you're on Windows?

Michael

Re: pad leading zeros in front of strings

Sarah Goslee
In reply to this post by Hui Du
Hi,

I think it's because " " is used to pad strings, while 0 is used to
pad numbers*. If your values are always numeric, but stored as
strings, you could use:

> x <- "123"
> sprintf("%05d", as.numeric(x))
[1] "00123"


* From ?sprintf:
    ‘0’ For numbers, pad to the field width with leading zeros.

I think some language implementations allow for specifying different
pad characters, but R's doesn't seem to.

Sarah

--
Sarah Goslee
http://www.functionaldiversity.org

Re: pad leading zeros in front of strings

Michael Weylandt
In reply to this post by Michael Weylandt
Please do reply to the list -- I'm not on Windows so someone else will
have to pick the question up to help you out.

It's not great, but you could do something like

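# Pad a single string `str` with leading zeros to a total width of `len.out`.
# max(0, ...) keeps rep() from failing when `str` is already longer than `len.out`.
zeroPad <- function(str, len.out, num.zeros = max(0, len.out[1] - nchar(str))){
   paste0(paste(rep("0", num.zeros), collapse = ""), str)
}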

as a temporary work-around. Probably possible to vectorize that pretty
easily as well.
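
A vectorized variant might look something like the sketch below. It is only an illustration: the name zero_pad is made up here, and it assumes strrep() is available (base R 3.3.0 or later, so newer than the R 2.15.0 discussed in this thread). Strings already at or beyond the requested width are returned unchanged.

```r
# Hypothetical vectorized helper (not from the original post).
# Assumes base R >= 3.3.0 for strrep().
zero_pad <- function(x, width) {
  x <- as.character(x)
  # Number of zeros each element needs; never negative.
  n_zeros <- pmax(0L, width - nchar(x))
  paste0(strrep("0", n_zeros), x)
}

zero_pad(c("123", "abc", "1234567"), 5)
# [1] "00123"   "00abc"   "1234567"
```

Because it never goes through sprintf(), the result should not depend on how the platform's C library treats the 0 flag for strings.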

Best,
Michael

On Tue, May 22, 2012 at 2:50 PM, Hui Du <[hidden email]> wrote:

> Thank you for your reply. Yes, I am on Windows.
>
> Best Regards,
> Hui Du
>
> Data Ventures Inc

Re: pad leading zeros in front of strings

Sarah Goslee
In reply to this post by Michael Weylandt
Michael, I'm curious: if you pass sprintf() a string, it still pads
with zeros? What's the output of:

sprintf("%05s", "123")
sprintf("%05s", "abc")

On Linux, sprintf() pads strings with spaces, as you'd expect. Padding
strings with zeros is... odd.

Sarah

> sessionInfo()
R version 2.15.0 (2012-03-30)
Platform: x86_64-redhat-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=C                 LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

loaded via a namespace (and not attached):
[1] tools_2.15.0

Re: pad leading zeros in front of strings

Michael Weylandt
I get "00123"  and "00abc" respectively. Agreed it's perhaps odd, but
c'est la OS.

M

Re: pad leading zeros in front of strings

Hui Du

Thanks all. I am trying to clean up user-entered zip codes. Most of them are purely numeric, but some users put letters in the zip code field, so I have to treat that field as a string rather than a number.

HXD

Re: pad leading zeros in front of strings

Sarah Goslee
On Tue, May 22, 2012 at 3:12 PM, Hui Du <[hidden email]> wrote:
>
> Thanks all. I am trying to clean up user-entered zip codes. Most of them are purely numeric, but some users put letters in the zip code field, so I have to treat that field as a string rather than a number.

But a padded invalid zip code is still an invalid zip code. So why
not fix them first, then pad them? grepl() would be useful for identifying
which entries contain non-numeric characters.
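
Roughly along these lines, as a sketch only: the zips vector is made-up sample data, and treating "one to five digits" as the only thing worth padding is an assumption about what counts as fixable.

```r
# Made-up sample input; "9021O" contains a letter O, not a zero.
zips <- c("123", "02139", "9021O", "4567", "abc")

# TRUE where the entry is one to five digits and nothing else.
is_numeric <- grepl("^[0-9]{1,5}$", zips)

# Pad only the clean entries; leave the others untouched for manual review.
padded <- zips
padded[is_numeric] <- sprintf("%05d", as.integer(zips[is_numeric]))

padded
# [1] "00123" "02139" "9021O" "04567" "abc"

zips[!is_numeric]   # the entries that still need fixing
# [1] "9021O" "abc"
```

Since the padding goes through %05d on integers rather than %05s on strings, it follows the documented behaviour of the 0 flag for numbers.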

Sarah

Re: pad leading zeros in front of strings

Rui Barradas
In reply to this post by Michael Weylandt
Hello,

I believe there's nothing OS-dependent about this:

# pad with zeros
padz <- function(x, n=max(nchar(x))) gsub(" ", "0", formatC(x, width=n))

padz(c(1, 10, 100), 5)
padz(c("a", "aa"))
padz(c("a", "aa"), 5)
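
One caveat with replacing every space: an input that already contains a space, such as "a b", would have that space turned into a zero as well. A possible variant that converts only the leading pad is sketched below; the name padz_leading is made up for illustration, and it assumes strrep() from base R 3.3.0 or later.

```r
# Hypothetical variant of padz() that leaves embedded spaces alone.
# Assumes base R >= 3.3.0 for strrep().
padz_leading <- function(x, n = max(nchar(x))) {
  s <- formatC(x, width = n)                         # right-justify, padded with spaces
  lead <- attr(regexpr("^ *", s), "match.length")    # count of leading spaces per element
  paste0(strrep("0", lead), substring(s, lead + 1L)) # swap only those for zeros
}

padz_leading(c(1, 10, 100), 5)
# [1] "00001" "00010" "00100"
padz_leading(c("a b", "aa"), 5)
# [1] "00a b" "000aa"
```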

Rui Barradas