IBrokers - reqHistory results in missing random data

algotr8der
Has anyone come across situations where the export of historical data from IB's API has missing data points that occur randomly?

I did the following to download data for the Select SPDR ETFs:

> tws <- twsConnect()
> contract <- twsEquity('XHB','SMART','ISLAND')
> reqHistory(tws, Contract=contract) -> XHB

You can split the data by day doing the following:

> split.xts(XHB, f="days") -> XHBsplit

Then you can cycle through each element of the XHBsplit list and count the data points with dim(). A full trading day should have 390 1-minute data points (note: you can ignore the half day after Thanksgiving; I would delete that day from the record anyway, as I'm not interested in trading half days).
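
For anyone who wants to reproduce the per-day count, here is a minimal sketch in base R. It works on a plain vector of bar timestamps (with an xts object these would come from index(XHB)). The function name bars_per_day, the time zone and the synthetic data are assumptions made for illustration; nothing here is output from IB.

```r
# Count bars per trading day from a vector of bar timestamps (POSIXct).
# With an xts object, the timestamps would come from index(XHB).
bars_per_day <- function(stamps, tz = "America/New_York") {
  days <- format(stamps, "%Y-%m-%d", tz = tz)
  counts <- table(days)
  data.frame(day = names(counts), bars = as.integer(counts),
             stringsAsFactors = FALSE)
}

# Synthetic example: one full session (390 bars) and one with two bars removed.
full  <- seq(as.POSIXct("2011-03-10 09:30:00", tz = "America/New_York"),
             by = "1 min", length.out = 390)
short <- seq(as.POSIXct("2011-03-11 09:30:00", tz = "America/New_York"),
             by = "1 min", length.out = 390)[-c(100, 200)]

counts <- bars_per_day(c(full, short))

# Keep only the days that fall short of a full 390-bar session.
incomplete <- counts[counts$bars < 390, ]
print(incomplete)
```

Filtering on bars < 390 keeps the output short when a year of data is checked; half days would show up here too and have to be excluded by hand.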

By default the above call retrieves 1 year's worth of 1-minute data. I received all of the data for XLE but noticed that XHB had fewer data points. I pulled up the TWS chart of XHB to check whether the same data points were missing there, but the chart was complete. According to IB, backfilling on the TWS chart uses the same data export framework as reqHistoricalData. So I decided to re-download XHB. This time the missing data points were different from those in the previous download.
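
One way to confirm that two downloads are missing different points is to compare their timestamps directly. The following is only a sketch in base R: compare_downloads is a hypothetical helper, and the two vectors are synthetic stand-ins for index() of the first and second download.

```r
# Compare the bar timestamps of two downloads of the same symbol.
# Returns the stamps unique to each download and the combined set.
compare_downloads <- function(first, second, tz = "America/New_York") {
  a <- as.numeric(first)
  b <- as.numeric(second)
  list(
    only_in_first  = first[!(a %in% b)],
    only_in_second = second[!(b %in% a)],
    combined       = as.POSIXct(sort(union(a, b)),
                                origin = "1970-01-01", tz = tz)
  )
}

# Synthetic example: ten 1-minute bars, each download missing a different one.
grid <- seq(as.POSIXct("2011-05-02 09:30:00", tz = "America/New_York"),
            by = "1 min", length.out = 10)
first  <- grid[-3]   # first download lacks the 09:32 bar
second <- grid[-7]   # second download lacks the 09:36 bar

cmp <- compare_downloads(first, second)
print(format(cmp$only_in_first,  "%H:%M", tz = "America/New_York"))
print(format(cmp$only_in_second, "%H:%M", tz = "America/New_York"))
print(length(cmp$combined))
```

If the gaps really do move between downloads, the combined set gets closer to a complete series with each additional download, although that works around the problem rather than explaining it.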

I used IB's TwsDde Excel file to cross-check and found that the data is present when using the Excel API. It could be that the problem did not surface because TwsDde limits the export of 1-minute data to 2 days' worth. I'm speculating here, but I do know that downloading via reqHistory produces data with missing data points that appear to occur randomly.

The other thing I noticed was that the data pulled by reqHistory begins at 9:30:00 and ends at 15:59:00 while the same data pulled using TwsDde begins at 9:31:00 and ends at 16:00:00.

2010-06-18 15:58:00    15.85    15.85   15.84     15.84       1569  15.843           0       337
2010-06-18 15:59:00    15.84    15.85   15.81     15.81       3518  15.828           0       527
2010-06-21 09:30:00    16.04    16.10   16.03     16.09        240  16.047           0        47
2010-06-21 09:31:00    16.09    16.09   16.07     16.08        226  16.081           0       119
2010-05-17 09:31:00    18.00    18.03   18.00     18.02        115  18.020           0        39

Re: IBrokers - reqHistory results in missing random data

Jeffrey Ryan-2
On Wed, May 18, 2011 at 2:33 PM, algotr8der <[hidden email]> wrote:

> Has anyone come across situations where the export of historical data from
> IB's API has missing data points that occur randomly?
>

I'm not too sure about 'randomly' but I have seen something similar in
terms of missing dates/times/contracts on occasion.


> I did the following to download data for the Select SPDR etfs:
>
> > tws <- twsConnect()
> > contract <- twsEquity('XHB','SMART','ISLAND')
> > reqHistory(tws, Contract=contract) -> XHB
>
> By default this is set to retrieve 1 year's worth of minute data. I received
> all of the data for XLE but noticed that XHB has fewer data points. I pulled
> up the chart of XHB to examine whether those missing data points showed up
> on the chart but all was well on the TWS Chart. As per IB, the backfilling
> on the TWS Chart uses the same data export framework as that used by
> reqHistoricalData. So I decided to re-download XHB. This time the missing
> data points were different from the previous download.
>

reqHistory isn't much more than an lapply over/around the max download limit
per call.  Maybe you could send me off-list the output of your request to
see if I get the same issue.  Another thing to help debug is to run this on
the IBGateway - and send me a copy of the log file.  setServerLogLevel(tws,
5) might do the same as well.
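
To make the "lapply around the download limit" idea concrete, here is a rough sketch of how a long history can be cut into per-request end times. This is not the IBrokers source: chunk_end_times is a hypothetical helper and the 5-day chunk size is an assumed value. Each resulting string is in the "yyyymmdd HH:MM:SS" form and would be handed to one request (for example as the endDateTime argument of reqHistoricalData), with the results bound together afterwards.

```r
# Build the end times for a series of back-to-back history requests.
# 'days_per_chunk' is an assumed per-request span, not a documented IB limit.
chunk_end_times <- function(end, n_chunks, days_per_chunk = 5, tz = "UTC") {
  offsets <- (seq_len(n_chunks) - 1) * days_per_chunk * 86400
  ends <- end - offsets                      # newest first
  format(rev(ends), "%Y%m%d %H:%M:%S", tz = tz)  # oldest first
}

ends <- chunk_end_times(as.POSIXct("2011-05-18 16:00:00", tz = "UTC"),
                        n_chunks = 3)
print(ends)

# The wrapper pattern itself: one call per chunk, results collected in a list.
# The function body is a placeholder for the actual request.
results <- lapply(ends, function(e) paste("request ending", e))
print(length(results))
```

Since every chunk is a separate request, a hiccup on any single call can leave a hole in the middle of the combined series, which is one reason to look at the log for each request.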

I also would argue with IB that they aren't using the same framework for the
backfills - since you can do more in the TWS than the API allows - something
*is* different even at the user level.

>
> I used IB's TwsDde Excel file to cross verify and I noticed that the data is
> present using the Excel API. It could be that the problem did not surface
> because TwsDde limits the export of 1 minute data to 2 days worth of data.
> I'm speculating here but I do know that downloading via reqHistory produces
> data with missing data points that occur randomly.
>

The excel variant uses ActiveX - and I suspect it isn't really the same as
the socket version (Java, IBrokers, etc).  Test using the distributed Java
example program (or write one).  That would be more apples to apples.

>
> The other thing I noticed was that the data pulled by reqHistory begins at
> 9:30:00 and ends at 15:59:00 while the same using TwsDde begins at 9:31:00
> and ends at 16:00:00.
>
> 2010-06-18 15:58:00    15.85    15.85   15.84     15.84       1569  15.843           0       337
> 2010-06-18 15:59:00    15.84    15.85   15.81     15.81       3518  15.828           0       527
> 2010-06-21 09:30:00    16.04    16.10   16.03     16.09        240  16.047           0        47
> 2010-06-21 09:31:00    16.09    16.09   16.07     16.08        226  16.081           0       119
> 2010-05-17 09:31:00    18.00    18.03   18.00     18.02        115  18.020           0        39
>

This is a potential indication of the differences internal to the socket vs.
activeX.  From the log I get 20100611 14:59:00 as the last data stamp.  That
is how bars get printed by the API as well - they use the time from the
start of the minute, not the following one.  It is dumb - as this can then
introduce a lookahead bias if you aren't aware/paying attention.  Or if you
are merging with other data sources it causes havoc as well.  Point is,
IBrokers isn't doing anything to the timestamp - it is coming from the
TWS/IBG that way. You can set the output to be in POSIX seconds since the
epoch, though I am not too sure what that would do in terms of stamps.  I'll
check ...
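
As a small illustration of the stamping difference, the sketch below shifts start-of-minute stamps forward by 60 seconds so they line up with a source that stamps bars at the end of the minute. It uses plain POSIXct values; with an xts object the same shift would be applied to its index. The two sample stamps are taken from the rows quoted above, and the time zone is an assumption.

```r
# Bars stamped at the START of the minute, as delivered by the socket API.
start_stamped <- as.POSIXct(c("2010-06-18 15:58:00", "2010-06-18 15:59:00"),
                            tz = "America/New_York")

# Shift by one bar length (60 s) to get END-of-minute stamps, so a bar is
# never labelled with a time at which it was still forming.
end_stamped <- start_stamped + 60
print(format(end_stamped, "%H:%M:%S", tz = "America/New_York"))

# Round trip through seconds since the epoch: the instants are unchanged,
# only the representation differs.
secs <- as.numeric(start_stamped)
back <- as.POSIXct(secs, origin = "1970-01-01", tz = "America/New_York")
print(all(back == start_stamped))
```

Whichever convention is chosen, it has to be the same on both sides before merging with another data source, otherwise every bar is off by one minute.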

Best,
Jeff

> --
> View this message in context:
> http://r.789695.n4.nabble.com/IBrokers-reqHistory-results-in-missing-random-data-tp3533694p3533694.html
> Sent from the Rmetrics mailing list archive at Nabble.com.
>
> _______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-finance
> -- Subscriber-posting only. If you want to post, subscribe first.
> -- Also note that this is not the r-help list where general R questions
> should go.
>



--
Jeffrey Ryan
[hidden email]

www.lemnica.com
www.esotericR.com

Re: IBrokers - reqHistory results in missing random data

algotr8der
On 5/18/11 4:25 PM, Jeffrey Ryan wrote:

> On Wed, May 18, 2011 at 2:33 PM, algotr8der <[hidden email]> wrote:
>
>> Has anyone come across situations where the export of historical data from
>> IB's API has missing data points that occur randomly?
>>
> I'm not too sure about 'randomly' but I have seen something similar in
> terms of missing dates/times/contracts on occasion.
>
>
>> I did the following to download data for the Select SPDR etfs:
>>
>>> tws <- twsConnect()
>>> contract <- twsEquity('XHB','SMART','ISLAND')
>>> reqHistory(tws, Contract=contract) -> XHB
>> By default this is set to retrieve 1 year's worth of minute data. I received
>> all of the data for XLE but noticed that XHB has fewer data points. I pulled
>> up the chart of XHB to examine whether those missing data points showed up
>> on the chart but all was well on the TWS Chart. As per IB, the backfilling
>> on the TWS Chart uses the same data export framework as that used by
>> reqHistoricalData. So I decided to re-download XHB. This time the missing
>> data points were different from the previous download.
>>
>>
> reqHistory isn't much more than an lapply over/around the max download limit
> per call.  Maybe you could send me off-list the output of your request to
> see if I get the same issue.  Another thing to help debug is to run this on
> the IBGateway - and send me a copy of the log file.  setServerLogLevel(tws,
> 5) might do the same as well.
>
> I also would argue with IB that they aren't using the same framework for the
> backfills - since you can do more in the TWS than the API allows - something
> *is* different even at the user level.
Thanks for looking at this Jeff. Appreciate it.

I share your feelings in that something *is* different between the two
frameworks. I had a long discussion with one of IB's API representatives
today in regards to that but did not make much progress there. I have to
write up a ticket but I thought I would do further investigation first.

So I executed reqHistory() using the IBGateway as you suggested. I will
upload the logs in a follow-up post as the API log is rather large and I
don't want to clog people's inboxes.

This time when I downloaded XHB, the following dates (see below) had
incomplete data. The number below each date is the count of individual
data points present for that day. The day after Thanksgiving should be
the only exception as it is a half trading day.

# Split by day, then print each day's first timestamp and its bar count
testXHB <- split(XHB, f = "days")
for (i in seq_along(testXHB)) {
        print(index(testXHB[[i]])[1])
        print(nrow(testXHB[[i]]))
}

[1] "2010-07-19 09:30:00 EDT"
[1] 389
[1] "2010-09-08 09:30:00 EDT"
[1] 389
[1] "2010-10-22 09:30:00 EDT"
[1] 389
[1] "2010-10-26 09:30:00 EDT"
[1] 389
[1] "2010-11-17 09:30:00 EST"
[1] 389
[1] "2010-11-26 09:30:00 EST"
[1] 210
[1] "2010-11-30 09:30:00 EST"
[1] 389
[1] "2010-12-30 09:30:00 EST"
[1] 389
[1] "2010-12-31 09:30:00 EST"
[1] 389
[1] "2011-02-14 09:30:00 EST"
[1] 389
[1] "2011-03-11 09:30:00 EST"
[1] 386
[1] "2011-04-14 09:30:00 EDT"
[1] 389
[1] "2011-04-25 09:30:00 EDT"
[1] 387
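
The counts show which days are short but not which minutes are gone. A sketch for locating them, in base R: build the expected 09:30 to 15:59 grid for a day and take the set difference against the stamps actually received. missing_minutes is a hypothetical helper and the example day is synthetic; with the real data the stamps would be index(XHB).

```r
# List the 1-minute bars missing from a regular session (09:30 to 15:59).
missing_minutes <- function(stamps, day, tz = "America/New_York") {
  expected <- seq(as.POSIXct(paste(day, "09:30:00"), tz = tz),
                  by = "1 min", length.out = 390)
  want <- format(expected, "%Y-%m-%d %H:%M", tz = tz)
  got  <- format(stamps,   "%Y-%m-%d %H:%M", tz = tz)
  setdiff(want, got)
}

# Synthetic day with three bars removed: 10:00, 10:01 and 15:59.
session <- seq(as.POSIXct("2011-05-02 09:30:00", tz = "America/New_York"),
               by = "1 min", length.out = 390)
received <- session[-c(31, 32, 390)]

gaps <- missing_minutes(received, "2011-05-02")
print(gaps)
```

Running this over each short day from two separate downloads would show directly whether the holes fall on the same minutes or move around.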


>> I used IB's TwsDde Excel file to cross verify and I noticed that the data is
>> present using the Excel API. It could be that the problem did not surface
>> because TwsDde limits the export of 1 minute data to 2 days worth of data.
>> I'm speculating here but I do know that downloading via reqHistory produces
>> data with missing data points that occur randomly.
>>
> The excel variant uses ActiveX - and I suspect it isn't really the same as
> the socket version (Java, IBrokers, etc).  Test using the distributed Java
> example program (or write one).  That would be more apples to apples.
Later today or sometime tomorrow I will test a Java example program to
compare to the ActiveX. Will provide further feedback.

>> The other thing I noticed was that the data pulled by reqHistory begins at
>> 9:30:00 and ends at 15:59:00 while the same using TwsDde begins at 9:31:00
>> and ends at 16:00:00.
>>
>> 2010-06-18 15:58:00    15.85    15.85   15.84     15.84       1569  15.843           0       337
>> 2010-06-18 15:59:00    15.84    15.85   15.81     15.81       3518  15.828           0       527
>> 2010-06-21 09:30:00    16.04    16.10   16.03     16.09        240  16.047           0        47
>> 2010-06-21 09:31:00    16.09    16.09   16.07     16.08        226  16.081           0       119
>> 2010-05-17 09:31:00    18.00    18.03   18.00     18.02        115  18.020           0        39
>>
> This is a potential indication of the differences internal to the socket vs.
> activeX.  From the log I get 20100611 14:59:00 as the last data stamp.  That
> is how bars get printed by the API as well - they use the time from the
> start of the minute, not the following one.  It is dumb - as this can then
> introduce a lookahead bias if you aren't aware/paying attention.  Or if you
> are merging with other data sources it causes havoc as well.  Point is,
> IBrokers isn't doing anything to the timestamp - it is coming from the
> TWS/IBG that way. You can set the output to be in POSIX seconds since the
> epoch, though I am not too sure what that would do in terms of stamps.  I'll
> check ...
>
> Best,
> Jeff
This is an issue and I really wonder why they are doing this. I need to
follow-up with IB on this.


Re: IBrokers - reqHistory results in missing random data

algotr8der
I must have spoken too soon. Here is what I have discovered.

1) The time issue is not present when you use TwsDde; however, it is present when you use TwsActiveX. When you export historical data using TwsActiveX, the first 1-minute intraday bar is stamped 09:30:00 and the last 15:59:00. The same does not occur when you use TwsDde.

2) The missing data issue occurs with both TwsDde and TwsActiveX.

I haven't been able to use the distributed Java API client yet because I've had technical issues which I am trying to sort out at the moment.

Re: IBrokers - reqHistory results in missing random data

algotr8der
I am told there is a bug on IB's end. I have asked for further detail. I will provide further information as I become aware.

Re: IBrokers - reqHistory results in missing random data

Kostas Evangelinos
On Tue, May 24, 2011 at 10:59:35AM -0700, algotr8der wrote:
| I am told there is a bug on IB's end. I have asked for further detail. I will
| provide further information as I become aware.

You might want to try ibfetch - I use this to download historical and realtime
data from IB into csv files daily. I haven't seen issues like the ones you
describe.

http://www.gaffa.net/stuff/ibfetch-0.2.tar.gz

Example:
$ ibfetch -i 20 -l 2 -s 20110402 -S '1 min' AUD.JPY ESc1

Kostas

Re: IBrokers - reqHistory results in missing random data

Jeffrey Ryan-2
For one, if this uses the sockets there would be zero difference. IBrokers
is a raw translation (open to read!) of the socket protocol, and simply
would fail if incorrect (it is not).  Unless you are running on the same
data, at the same time, with the same requests your conclusion has no basis
in fact - as it is nothing more than conjecture.

Second, linking to a binary(?!) without some context around it (google
search and directories of .net domain provide nothing) is about as useless
as saying nothing.  Certainly nothing to do with R, or even a
solution/insight - except for those naive enough to run it. Your email and
name aren't anywhere in my records of contributors to R-sig-finance, or R.

I'd suggest this has nothing to do with an R solution and nothing to do with
R at all.  There are myriad ways to accomplish requests - all of the others
aren't suitable to the thread in question.

Best,
Jeff

On Sat, May 28, 2011 at 10:18 AM, Kostas Evangelinos <[hidden email]> wrote:

> On Tue, May 24, 2011 at 10:59:35AM -0700, algotr8der wrote:
> | I am told there is a bug on IB's end. I have asked for further detail. I will
> | provide further information as I become aware.
>
> You might want to try ibfetch - I use this to download historical and realtime
> data from IB into csv files daily. I haven't seen issues like the ones you
> describe.
>
> http://www.gaffa.net/stuff/ibfetch-0.2.tar.gz
>
> Example:
> $ ibfetch -i 20 -l 2 -s 20110402 -S '1 min' AUD.JPY ESc1
>
> Kostas
>

Re: IBrokers - reqHistory results in missing random data

algotr8der
>>For one, if this uses the sockets there would be zero difference. IBrokers
>>is a raw translation (open to read!) of the socket protocol, and simply
>>would fail if incorrect (it is not).  Unless you are running on the same
>>data, at the same time, with the same requests your conclusion has no basis
>>in fact - as it is nothing more than conjecture.

Hi Jeff - I must not have been clear in my latest post in this thread - sincere apologies for that. When I said 'I am told there is a bug on IB's end' I meant IB = Interactive Brokers and not IBrokers. Sorry for the confusion. This is something internal to Interactive Brokers, so it has nothing to do with the IBrokers R package.

>>Second, linking to a binary(?!) without some context around it (google
>>search and directories of .net domain provide nothing) is about as useless
>>as saying nothing.  Certainly nothing to do with R, or even a
>>solution/insight - except for those naive enough to run it. Your email and
>>name aren't anywhere in my records of contributors to R-sig-finance, or R.

I'm not sure what you are trying to say in the above paragraph.

>>I'd suggest this has nothing to do with an R solution and nothing to do with
>>R at all.  There are myriad ways to accomplish requests - all of the others
>>aren't suitable to the thread in question.

I agree that this has nothing to do with an R solution, which is why I posted an update indicating that the problem occurs with Interactive Brokers' own tools:

1) The time issue is not present when you use TwsDde; however, it is present when you use TwsActiveX. When you export historical data using TwsActiveX, the first 1-minute intraday bar is stamped 09:30:00 and the last 15:59:00. The same does not occur when you use TwsDde.

2) The missing data issue occurs with both TwsDde and TwsActiveX.

I thought that issue #2 might be down to 'poor quality' data rather than a problem with the export mechanism. But the Interactive Brokers technical support rep I was working with said on several occasions that he did not have missing data when testing with the same tools for the same time periods and the same symbol. That is his *claim* and not something I can independently verify; all I can say is that in my testing there are gaps in the data exported from Interactive Brokers using their tools.

All the best.
AT

