collect data from the web

Rmetrics mailing list
Hello everyone! Good Morning...
I would like some tips on how to get data from a web page [1] from within R.
I made a few attempts, but they were unsuccessful because I have little
knowledge of the HTTP/HTTPS protocols.
If anyone can provide any guidance, I would appreciate it very much.

My difficulty is that this page appears to be a front end that delegates
to several JavaScript files, which call the server that actually delivers
the data.

Thank you in advance!

Cleber Borges

[1] - https://bbtc.cma.com.br/topchart/?&pagetitle=IBOVESPA






_______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-finance
-- Subscriber-posting only. If you want to post, subscribe first.
-- Also note that this is not the r-help list where general R questions should go.
Re: collect data from the web

Ezra Tucker-2
Hm, personally I can't give a complete answer, but I can offer (1) some general pointers that have helped me, and (2) a possibly different approach.

What usually works for me when getting data from the web is using the developer tools in a web browser. If you can see which requests your browser makes and which responses the server sends back, you should be able to reproduce them using the httr package (or RCurl, etc.) and parse the response using rvest.
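To make that a bit more concrete, here is a minimal sketch of the httr + rvest workflow. The endpoint URL and the helper names (fetch_page, parse_tables) are placeholders I made up for illustration; the real request URL has to come from the Network tab of your browser's developer tools, and the sample HTML below is dummy content, not data from the page in question.

```r
library(httr)
library(rvest)

# Hypothetical endpoint: replace with the request URL seen in the browser's Network tab.
endpoint <- "https://example.com/data"

# Download a page as text, failing loudly on HTTP errors (4xx/5xx).
fetch_page <- function(url) {
  resp <- GET(url, user_agent("Mozilla/5.0"), timeout(30))
  stop_for_status(resp)
  content(resp, as = "text", encoding = "UTF-8")
}

# Parse every <table> in an HTML string into a list of data frames.
parse_tables <- function(html_text) {
  doc <- read_html(html_text)
  html_table(html_elements(doc, "table"))
}

# Offline check with dummy HTML (placeholder values only).
sample_html <- "<table><tr><th>label</th><th>value</th></tr><tr><td>a</td><td>1</td></tr></table>"
tables <- parse_tables(sample_html)

# With network access, the two steps would be chained like this:
# tables <- parse_tables(fetch_page(endpoint))
```

If the response turns out to be JSON rather than HTML, which is common for pages driven by JavaScript, rvest is not the right tool and something like jsonlite::fromJSON() on the response text is probably a better fit.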

That said, after a quick glance at this page, it looks like it is built from several iframes (web pages within web pages, basically) and a lot of JavaScript. Data loaded by JavaScript can be tougher to extract programmatically, which leads me to option #2: are there any other sources of these data that might be easier to deal with?
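If you do want to dig into the iframes, one possible first step is to list where each one points, so that each embedded page can be fetched and inspected on its own. This is only a sketch: list_iframes is a name I made up, and the HTML and base URL below are dummy values, not the structure of the actual page.

```r
library(rvest)
library(xml2)

# Return the absolute URLs of all iframes found in an HTML string.
# base_url is needed to resolve relative src attributes.
list_iframes <- function(html_text, base_url) {
  doc <- read_html(html_text)
  src <- html_attr(html_elements(doc, "iframe"), "src")
  src <- src[!is.na(src)]          # drop iframes without a src attribute
  url_absolute(src, base_url)
}

# Dummy page with two iframes (placeholder content).
sample_page <- paste0(
  "<html><body>",
  "<iframe src=\"chart.html\"></iframe>",
  "<iframe src=\"/quotes/table.html\"></iframe>",
  "</body></html>"
)
frames <- list_iframes(sample_page, "https://example.com/app/")
```

Note that this only finds iframes present in the static HTML. Anything a script injects after the page loads will not show up, and for that the Network tab is still the more reliable guide.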

It *looks* like you're trying to get stock price data for IBOVESPA. Is this the same thing as ^BVSP? If so, have you tried using the quantmod package?

library(quantmod)
getSymbols("^BVSP", src = "yahoo")  # this will automatically get the data, and assign it to a variable called "BVSP"
print(BVSP)
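
A small variation on the snippet above, in case you would rather have the data returned as an object than assigned into your workspace. The wrapper name fetch_index and the default start date are my own choices for illustration; auto.assign and from are existing getSymbols arguments.

```r
library(quantmod)

# Hypothetical helper: returns the series instead of assigning it globally.
fetch_index <- function(symbol = "^BVSP", from = "2020-01-01") {
  getSymbols(symbol, src = "yahoo", from = from, auto.assign = FALSE)
}

# Usage (needs network access, so left commented out here):
# bvsp <- fetch_index()
# head(Cl(bvsp))   # Cl() extracts the closing-price column
```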

Hopefully that helps some.

--
[hidden email]
m: 818-203-0269
LinkedIn: linkedin.com/in/ezztucker
Github: github.com/minimenchmuncher

‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Sunday, September 27, 2020 7:41 AM, Cleber N.Borges via R-SIG-Finance <[hidden email]> wrote:

> Hello everyone! Good Morning...
> I would like some tips on how to get data from a web page [1] from within R.
> I made a few attempts, but they were unsuccessful because I have little
> knowledge of the HTTP/HTTPS protocols.
> If anyone can provide any guidance, I would appreciate it very much.
>
> My difficulty is that this page appears to be a front end that delegates
> to several JavaScript files, which call the server that actually delivers
> the data.
>
> Thank you in advance!
>
> Cleber Borges
>
> [1] - https://bbtc.cma.com.br/topchart/?&pagetitle=IBOVESPA
>
