grep and XML

sjkiss
Hi all:
I struggle a lot scraping web data. I still haven't got a handle on the XML package.
I'd like to get particular exchange rates from this table:
https://raw.github.com/currencybot/open-exchange-rates/master/latest.json
This is the code that I'm working with:
library(RCurl)
library(XML)

txt <- getURL("https://raw.github.com/currencybot/open-exchange-rates/master/latest.json")
txt <- htmlParse(txt, asText=TRUE)
txt <- getNodeSet(txt, '//p')
So I can get the node properly, but then if I try something like this:
grep(c('USD'), txt)

I get:
integer(0)

Can anyone suggest a way forward?
Yours, Simon Kiss

*********************************
Simon J. Kiss, PhD
Assistant Professor, Wilfrid Laurier University
73 George Street
Brantford, Ontario, Canada
N3T 2C9
Cell: +1 905 746 7606

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: grep and XML

Henrique Dallazuanna
Try this:

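library(rjson)
# The URL serves JSON, not HTML or XML, so read it with a JSON parser.
j <- fromJSON(file =
'https://raw.github.com/currencybot/open-exchange-rates/master/latest.json')
# The rates come back as a named list; pick one by its currency code.
j$rates$USD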
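
To pull several rates at once, something along these lines might work. This is only a sketch: it parses a small inline JSON string rather than the live URL, and that string is a made-up sample that assumes nothing beyond the `rates` element used above. The currency values and the names `sample_json`, `wanted` and `rates` are hypothetical.

```r
library(rjson)

# Hypothetical sample with the same top-level "rates" element as above;
# the numbers are invented for illustration only.
sample_json <- '{"rates": {"USD": 1, "CAD": 0.99, "EUR": 0.76}}'
j <- fromJSON(sample_json)

# j$rates is a named list, so several currencies can be selected by name
# and flattened into a named numeric vector.
wanted <- c("CAD", "EUR")
rates <- unlist(j$rates[wanted])
print(rates)

# grep needs a character vector to search; the currency codes are the names.
usd_pos <- grep("USD", names(j$rates))
print(usd_pos)
```

With the real feed, replacing `fromJSON(sample_json)` with the `fromJSON(file = ...)` call above should give the same structure, provided the file still has a `rates` element.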

On Mon, Apr 16, 2012 at 6:03 PM, Simon Kiss <[hidden email]> wrote:

> Hi all:
> I struggle a lot scraping web data. I still haven't got a handle on the XML package.
> I'd like to get particular exchange rates from this table:
> https://raw.github.com/currencybot/open-exchange-rates/master/latest.json
> This is the code that I'm working with:
> library(RCurl)
> library(XML)
>
> txt <- getURL("https://raw.github.com/currencybot/open-exchange-rates/master/latest.json")
> txt <- htmlParse(txt, asText=TRUE)
> txt <- getNodeSet(txt, '//p')
> So I can get the node properly, but then if I try something like this:
> grep(c('USD'), txt)
>
> I get:
> integer(0)
>
> Can anyone suggest a way forward?



--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O
