DBI solution

classic Classic list List threaded Threaded
4 messages Options
Reply | Threaded
Open this post in threaded view
|

DBI solution

Ben Nachtrieb
Hello,

I'm building (from the ground up) a PostgreSQL time series database for use
with R. I'd like to get some preliminary guidance (help me get pointed in
the correct direction).

Here are some details: I am PC/Windows centric. I have data of all
frequencies equal to or greater than Daily (no inter-day data). We have
prices and technical data, macro data, company descriptive, and financial
data. I don't have to do any 'DBA' stuff per-say (no messy stuff like split
adjustments, ticker changes, etc.) as that is done for me; however, I will
have to adjust for look-ahead, create new factor values, transform, etc.
the data that we have and store it in database form. Can someone point me
to the best solution/packages for this given that I have to stay in the R
and PostgreSQL world?

I see RpgSQL, RposgreSQL, TSPostgreSQL, DBI, TSdbi, and much more... I am
hoping someone can narrow things down for me.

Thanks so much!
--
Ben

        [[alternative HTML version deleted]]

_______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-finance
-- Subscriber-posting only. If you want to post, subscribe first.
-- Also note that this is not the r-help list where general R questions should go.
Reply | Threaded
Open this post in threaded view
|

Re: DBI solution

Gabor Grothendieck
On Fri, Nov 4, 2011 at 7:08 PM, Ben Nachtrieb <[hidden email]> wrote:

> Hello,
>
> I'm building (from the ground up) a PostgreSQL time series database for use
> with R. I'd like to get some preliminary guidance (help me get pointed in
> the correct direction).
>
> Here are some details: I am PC/Windows centric. I have data of all
> frequencies equal to or greater than Daily (no inter-day data). We have
> prices and technical data, macro data, company descriptive, and financial
> data. I don't have to do any 'DBA' stuff per-say (no messy stuff like split
> adjustments, ticker changes, etc.) as that is done for me; however, I will
> have to adjust for look-ahead, create new factor values, transform, etc.
> the data that we have and store it in database form. Can someone point me
> to the best solution/packages for this given that I have to stay in the R
> and PostgreSQL world?
>
> I see RpgSQL, RposgreSQL, TSPostgreSQL, DBI, TSdbi, and much more... I am
> hoping someone can narrow things down for me.
>
> Thanks so much!

RpgSQL and RPostgreSQL are DBI-based drivers for R and the others are
higher level packages.

I can only address my own package but RpgSQL was developed for use
with sqldf.  It uses RJDBC (which uses JDBC and Java) and DBI.
Feedback is that RpgSQL is easier to install on Windows than
alternatives.  The use of RJDBC/JDBC may slow it down relative to
alternatives although speed may not be material if you are just
importing a bunch of time series and then working with them in R as
opposed to constantly going back to the database.

sqldf is a package that lets you use R data frames with several
database back-ends including RpgSQL.  You would not use sqldf with
your own database but might use it with R data frames to play around
with PostgreSQL.

Resources are

- the Installation info in this link:
  http://cran.r-project.org/web/packages/RpgSQL/index.html

- ?pgSQL help page (see example at end of page)

- http://sqldf.googlecode.com
  most of which is about sqldf and sqlite although much of it applies
and there is a bit about using  PostgreSQL with sqldf.

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

_______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-finance
-- Subscriber-posting only. If you want to post, subscribe first.
-- Also note that this is not the r-help list where general R questions should go.
Reply | Threaded
Open this post in threaded view
|

Re: DBI solution

Paul Gilbert
In reply to this post by Ben Nachtrieb
Ben

(Possibly too late to be preliminary guidance, but ...)

TSdbi is the base package in a group of packages, including TSPostgreSQL, which try to provide a common interface (API) to time series databases. That is, you specify the connection, and after that all of your R code syntax can be the same, and does not depend on the specifics of the underlying mechanism. These packages are almost all wrappers for other packages (eg RPostgreSQL), so the main benefits to using them rather than the underlying packages are that they provide a common interface, and a mechanism for returning a specified time series representation. (For example, the fame package returns tis series, but TSfame handles conversion and allows the possibility of returning other representations like zoo series.)

The SQL variants (TSPostgreSQL, TSMySQL, TSSQLite, TSodbc, and untested TSOracle) include table structure definitions for the database. The database does not need to be built with R, but the table structure needs to be respected for the TS* SQL variants to work.

The non-SQL variants (TSfame, TSxls, TSgetSymbol, TShistQuote, ...) pull data from other sources, mostly the web  except for TSfame.

With the SQL tables I believe series of daily frequency and lower are handled fairly well. (I work mostly with monthly and quarterly data, but also use daily and weekly data.) In theory, tick data (time stamped series) are also handled, but I have never work with that kind of data, so it is not well tested.

The tables also provide a mechanism for storing meta data descriptions of series, so you can store company descriptions, but there is no SQL structure within the description. That is, you could not do a SQL query to select certain types of businesses based on the description. (This would probably not be too difficult to implement, but it is not in the structure provided.)

There is a not very well tested mechanism for handling series name changes (by an alias). I believe this could be used for ticker changes, but I'm not sure.

Being an economist rather than a financial person, I'm not exactly sure what you mean by "adjust for look-ahead, create new factor values, transform, etc." The database stores the time series data, but these things sound like the sort of thing I would do in R rather than in the database.

If you already have a backend SQL database, and are just building the interface not building the database, then the TSdbi package has a function TSquery that may be useful. I use this to construct time series from a relational SQL database that was built for purposes other than storing time series.

I do not work much in Windows, but the TS* packages should work without problem, as long as the underlying packages work in Windows. Thus, you need the PostgreSQL drivers to install RPostgreSQL, and then everything should work.

HTH,
Paul

> -----Original Message-----
> From: [hidden email] [mailto:r-sig-finance-
> [hidden email]] On Behalf Of Ben Nachtrieb
> Sent: November 4, 2011 7:08 PM
> To: [hidden email]
> Subject: [R-SIG-Finance] DBI solution
>
> Hello,
>
> I'm building (from the ground up) a PostgreSQL time series database for
> use
> with R. I'd like to get some preliminary guidance (help me get pointed
> in
> the correct direction).
>
> Here are some details: I am PC/Windows centric. I have data of all
> frequencies equal to or greater than Daily (no inter-day data). We have
> prices and technical data, macro data, company descriptive, and
> financial
> data. I don't have to do any 'DBA' stuff per-say (no messy stuff like
> split
> adjustments, ticker changes, etc.) as that is done for me; however, I
> will
> have to adjust for look-ahead, create new factor values, transform,
> etc.
> the data that we have and store it in database form. Can someone point
> me
> to the best solution/packages for this given that I have to stay in the
> R
> and PostgreSQL world?
>
> I see RpgSQL, RposgreSQL, TSPostgreSQL, DBI, TSdbi, and much more... I
> am
> hoping someone can narrow things down for me.
>
> Thanks so much!
> --
> Ben
>
> [[alternative HTML version deleted]]
>
> _______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-finance
> -- Subscriber-posting only. If you want to post, subscribe first.
> -- Also note that this is not the r-help list where general R questions
> should go.
====================================================================================

La version française suit le texte anglais.

------------------------------------------------------------------------------------

This email may contain privileged and/or confidential information, and the Bank of
Canada does not waive any related rights. Any distribution, use, or copying of this
email or the information it contains by other than the intended recipient is
unauthorized. If you received this email in error please delete it immediately from
your system and notify the sender promptly by email that you have done so.

------------------------------------------------------------------------------------

Le présent courriel peut contenir de l'information privilégiée ou confidentielle.
La Banque du Canada ne renonce pas aux droits qui s'y rapportent. Toute diffusion,
utilisation ou copie de ce courriel ou des renseignements qu'il contient par une
personne autre que le ou les destinataires désignés est interdite. Si vous recevez
ce courriel par erreur, veuillez le supprimer immédiatement et envoyer sans délai à
l'expéditeur un message électronique pour l'aviser que vous avez éliminé de votre
ordinateur toute copie du courriel reçu.
_______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-finance
-- Subscriber-posting only. If you want to post, subscribe first.
-- Also note that this is not the r-help list where general R questions should go.
Reply | Threaded
Open this post in threaded view
|

Re: DBI solution

rkevinburton
In reply to this post by Ben Nachtrieb
Is there a Windows SQL Server variant?


On Thu, Nov 10, 2011 at 12:28 PM, Paul Gilbert wrote:

> Ben
>
> (Possibly too late to be preliminary guidance, but ...)
>
> TSdbi is the base package in a group of packages, including
> TSPostgreSQL, which try to provide a common interface (API) to time
> series databases. That is, you specify the connection, and after that
> all of your R code syntax can be the same, and does not depend on the
> specifics of the underlying mechanism. These packages are almost all
> wrappers for other packages (eg RPostgreSQL), so the main benefits to
> using them rather than the underlying packages are that they provide a
> common interface, and a mechanism for returning a specified time
> series representation. (For example, the fame package returns tis
> series, but TSfame handles conversion and allows the possibility of
> returning other representations like zoo series.)
>
> The SQL variants (TSPostgreSQL, TSMySQL, TSSQLite, TSodbc, and
> untested TSOracle) include table structure definitions for the
> database. The database does not need to be built with R, but the table
> structure needs to be respected for the TS* SQL variants to work.
>
> The non-SQL variants (TSfame, TSxls, TSgetSymbol, TShistQuote, ...)
> pull data from other sources, mostly the web  except for TSfame.
> With the SQL tables I believe series of daily frequency and lower are
> handled fairly well. (I work mostly with monthly and quarterly data,
> but also use daily and weekly data.) In theory, tick data (time
> stamped series) are also handled, but I have never work with that kind
> of data, so it is not well tested.
>
> The tables also provide a mechanism for storing meta data descriptions
> of series, so you can store company descriptions, but there is no SQL
> structure within the description. That is, you could not do a SQL
> query to select certain types of businesses based on the description.
> (This would probably not be too difficult to implement, but it is not
> in the structure provided.)
>
> There is a not very well tested mechanism for handling series name
> changes (by an alias). I believe this could be used for ticker
> changes, but I'm not sure.
>
> Being an economist rather than a financial person, I'm not exactly
> sure what you mean by "adjust for look-ahead, create new factor
> values, transform, etc." The database stores the time series data, but
> these things sound like the sort of thing I would do in R rather than
> in the database.
>
> If you already have a backend SQL database, and are just building the
> interface not building the database, then the TSdbi package has a
> function TSquery that may be useful. I use this to construct time
> series from a relational SQL database that was built for purposes
> other than storing time series.
>
> I do not work much in Windows, but the TS* packages should work
> without problem, as long as the underlying packages work in Windows.
> Thus, you need the PostgreSQL drivers to install RPostgreSQL, and then
> everything should work.
>
> HTH,
> Paul
>
>> -----Original Message-----
>> From: [hidden email] [mailto:r-sig-finance-
>> [hidden email]] On Behalf Of Ben Nachtrieb
>> Sent: November 4, 2011 7:08 PM
>> To: [hidden email]
>> Subject: [R-SIG-Finance] DBI solution
>>
>> Hello,
>>
>> I'm building (from the ground up) a PostgreSQL time series database
>> for
>> use
>> with R. I'd like to get some preliminary guidance (help me get
>> pointed
>> in
>> the correct direction).
>>
>> Here are some details: I am PC/Windows centric. I have data of all
>> frequencies equal to or greater than Daily (no inter-day data). We
>> have
>> prices and technical data, macro data, company descriptive, and
>> financial
>> data. I don't have to do any 'DBA' stuff per-say (no messy stuff like
>> split
>> adjustments, ticker changes, etc.) as that is done for me; however, I
>> will
>> have to adjust for look-ahead, create new factor values, transform,
>> etc.
>> the data that we have and store it in database form. Can someone
>> point
>> me
>> to the best solution/packages for this given that I have to stay in
>> the
>> R
>> and PostgreSQL world?
>>
>> I see RpgSQL, RposgreSQL, TSPostgreSQL, DBI, TSdbi, and much more...
>> I
>> am
>> hoping someone can narrow things down for me.
>>
>> Thanks so much!
>> --
>> Ben
>>
>> [[alternative HTML version deleted]]
>>
>> _______________________________________________
>> [hidden email] mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-sig-finance
>> -- Subscriber-posting only. If you want to post, subscribe first.
>> -- Also note that this is not the r-help list where general R
>> questions
>> should go.
>
> ====================================================================================
>
> La version française suit le texte anglais.
>
>
> ------------------------------------------------------------------------------------
>
> This email may contain privileged and/or confidential information, and
> the Bank of
> Canada does not waive any related rights. Any distribution, use, or
> copying of this
> email or the information it contains by other than the intended
> recipient is
> unauthorized. If you received this email in error please delete it
> immediately from
> your system and notify the sender promptly by email that you have done
> so.
>
> ------------------------------------------------------------------------------------
>
> Le présent courriel peut contenir de l'information privilégiée ou
> confidentielle.
> La Banque du Canada ne renonce pas aux droits qui s'y rapportent.
> Toute diffusion,
> utilisation ou copie de ce courriel ou des renseignements qu'il
> contient par une
> personne autre que le ou les destinataires désignés est interdite. Si
> vous recevez
> ce courriel par erreur, veuillez le supprimer immédiatement et envoyer
> sans délai à
> l'expéditeur un message électronique pour l'aviser que vous avez
> éliminé de votre
> ordinateur toute copie du courriel reçu.
> _______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-finance
> -- Subscriber-posting only. If you want to post, subscribe first.
> -- Also note that this is not the r-help list where general R
> questions should go.

_______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-finance
-- Subscriber-posting only. If you want to post, subscribe first.
-- Also note that this is not the r-help list where general R questions should go.