Extracting numeric part from a string


Bogaso
Hi again,

I am struggling to extract the numeric part from the string below:

"\"cm_ffm\":\"563.77\""

Basically, I need to extract 563.77 from the string above. The underlying number
can be a whole number, and there could be a comma separator as well.

So far I have tried the following:

> library(stringr)
> str_extract("\"cm_ffm\":\"563.77\"", "[[:digit:]]+")
[1] "563"

However, the above code extracts only the integer part.

Could you please help me achieve that? Thanks,

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: Extracting numeric part from a string

Ismail SEZEN


library(readr)
parse_number('"cm_ffm":"563.77"')

Re: Extracting numeric part from a string

Marc Schwartz-3
In reply to this post by Bogaso


Using ?gsub:

> X <- "\"cm_ffm\":\"563.77\""
> gsub("[^0-9.]", "", X)
[1] "563.77"

or

> gsub("[^[:digit:].]", "", X)
[1] "563.77"


Basically, remove any characters that are not digits or the decimal point, presuming your pattern is consistent across your data.
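
As a minimal sketch of what "consistent" means here: stripping over the whole string also keeps any digit or dot that happens to sit in the key, so it may be safer to drop everything up to the colon first. The second input string and the helper name value_chars are made up for illustration, and the sketch assumes the value itself contains no colon.

```r
# Hypothetical inputs: the key of the second string contains a digit.
x <- c("\"cm_ffm\":\"563.77\"", "\"cm_ffm2\":\"563.77\"")

# Stripping over the whole string keeps the stray "2" from the key.
whole <- gsub("[^0-9.]", "", x)

# Illustrative helper (not from the thread): drop everything up to the
# last colon, then strip whatever is not a digit or a decimal point.
value_chars <- function(s) gsub("[^0-9.]", "", sub("^.*:", "", s))
after_colon <- value_chars(x)

print(whole)        # second element picks up the "2" from the key
print(after_colon)  # both elements contain only the value
```

Both gsub() and sub() are vectorised, so the same call works on a whole column of such strings.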

Regards,

Marc Schwartz


Re: Extracting numeric part from a string

Marc Schwartz-3


Sorry, forgot that you indicated that there could be a comma:

> X <- "\"cm_ffm\":\"1,563.77\""
> gsub("[^0-9.,]", "", X)
[1] "1,563.77"

or

> gsub("[^[:digit:].,]", "", X)
[1] "1,563.77"


Regards,

Marc

Re: Extracting numeric part from a string

Bert Gunter-2
In reply to this post by Ismail SEZEN
... Or if you just want to stick with basic regexes without extra packages:

> x <- "\"cm_ffm\":\"563.77\""
> sub("[^[:digit:]]*([[:digit:]]*\\.?[[:digit:]]*).*","\\1",x)

[1] "563.77"
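
A related base R sketch that extracts the match instead of deleting around it, using regexpr() and regmatches(). In a regex a bare dot matches any character, so the decimal point is escaped here, and the pattern also allows comma separators. The helper name extract_number is hypothetical, and the sketch assumes the number is the first run of digits in the string.

```r
# Illustrative helper (not from the thread): return the first number-like
# token in each string, or NA when a string contains no digits.
extract_number <- function(x) {
  # a digit, then digits or commas, then an optional decimal part
  pattern <- "[0-9][0-9,]*(\\.[0-9]+)?"
  m <- regexpr(pattern, x)
  out <- rep(NA_character_, length(x))
  out[m != -1] <- regmatches(x, m)
  out
}

x <- c("\"cm_ffm\":\"563.77\"",
       "\"cm_ffm\":\"563\"",
       "\"cm_ffm\":\"1,563.77\"",
       "\"cm_ffm\":\"\"")
res <- extract_number(x)
print(res)
```

The result is still a character vector; any commas have to be removed before it can be converted to numeric.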

Cheers,
Bert



Re: Extracting numeric part from a string

Bert Gunter-2
... and Marc's solution is **much** better than mine.

-- Bert
Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


Re: Extracting numeric part from a string

Marc Schwartz-3
Thanks Bert.

I should probably also explicitly mention that if Christofer wants to ultimately coerce the numeric components of the strings to numeric data types for subsequent mathematical operations, he will need to strip the commas anyway.

In that case, my first response, where I did not include the comma character in the regex, may be preferred.
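
A short sketch of that coercion step, assuming the comma is a thousands separator rather than a decimal mark. The variable names and the extra example strings are illustrative only.

```r
X <- c("\"cm_ffm\":\"563.77\"",
       "\"cm_ffm\":\"1,563.77\"",
       "\"cm_ffm\":\"563\"")

# Keeping the comma leaves a string that as.numeric() cannot parse,
# so the second element becomes NA (with a coercion warning).
with_comma <- gsub("[^0-9.,]", "", X)
bad <- suppressWarnings(as.numeric(with_comma))

# Dropping the comma along with everything else gives clean input.
nums <- as.numeric(gsub("[^0-9.]", "", X))

print(bad)
print(nums)
```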

Regards,

Marc


