xts timeseries as shared-memory objects with bigmemory package ?

soren wilkening
Hi to the list

I am looking into inter-process communication between two or more
instances of R (e.g. one process to receive market data, the other one to
execute a strategy and place orders).

I have seen that, with the 'bigmemory' package, a matrix defined as

shared <- big.matrix(....)

can be accessed by several different R processes, quite easily. Is there
an easy way to use this mechanism for sharing an xts timeseries? I
haven't seen how to do that.

My best guess would be to define the shared matrix as type='double' and
every time I need to read from or write to it, make the corresponding
matrix<==>xts conversion.  That would probably work since the matrix
itself is not huge, but it would be a bit more elegant if I could share
the actual xts object.
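
A rough sketch of that round trip, assuming the index is kept in the first column of the shared matrix and the data in the remaining columns (this layout is an assumption made for illustration, not something bigmemory prescribes). Both "processes" run in one session here; in practice the descriptor would have to be handed to the second R process somehow, for example through a descriptor file.

```r
library(xts)
library(bigmemory)

# Example series: 5 rows, 2 data columns (illustrative data only)
x <- xts(matrix(as.numeric(1:10), ncol = 2), Sys.time() + 1:5)

# Writer side: column 1 holds the numeric index, columns 2..3 the data
shared <- big.matrix(nrow = nrow(x), ncol = ncol(x) + 1, type = "double")
shared[, 1]   <- as.numeric(.index(x))
shared[, 2:3] <- coredata(x)
desc <- describe(shared)          # descriptor the reader needs to attach

# Reader side (normally a different R process)
attached <- attach.big.matrix(desc)
m <- attached[, ]                 # copy into an ordinary R matrix
y <- .xts(m[, -1, drop = FALSE], m[, 1])
```

The time zone, index class and column names are not stored in the matrix, so the reader has to re-apply them if they matter.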

Thankful for any ideas

regards

Soren

http://censix.com

_______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-finance
-- Subscriber-posting only. If you want to post, subscribe first.
-- Also note that this is not the r-help list where general R questions should go.
Re: xts timeseries as shared-memory objects with bigmemory package ?

Jeffrey Ryan-2
Hi Soren,

There are a couple alternatives that are closer to the 'metal' if you
will.  The first, and (maybe) recommended way would be to use the mmap
package and a struct.  mmap doesn't support dimensioned data (yet),
but the struct is quite useful in that you can have different types
embedded in something of a list format.

I'll be giving a short lightning talk on this at R/Finance 2011 that
we are hosting in Chicago next month - which I hope to see everyone at
;-)  In case anyone missed the emails
http://www.RinFinance.com/register

For now, the free, quick, incomplete and ugly version:

> library(mmap)
> library(xts)
Loading required package: zoo
> x <- xts(1:10,Sys.time()+1:10)
> x
                    [,1]
2011-03-17 12:48:16    1
2011-03-17 12:48:17    2
2011-03-17 12:48:18    3
2011-03-17 12:48:19    4
2011-03-17 12:48:20    5
2011-03-17 12:48:21    6
2011-03-17 12:48:22    7
2011-03-17 12:48:23    8
2011-03-17 12:48:24    9
2011-03-17 12:48:25   10

# create an on disk object of the correct size
> tmp <- tempfile()
> length(coredata(x)) * sizeof(double) + length(.index(x)) * sizeof(double)
[1] 160
> writeBin(raw(160),tmp)

# create a mapping to it using the struct() construct in mmap

> m <- mmap(tmp, struct(double(),double()))

# assign the components, keeping in mind that you'd need multiple
# struct columns to handle multiple xts columns
> m[,1] <- .index(x)
> m[,2] <- coredata(x)[,1]

# the raw on disk data
> m[]
[[1]]
 [1] 1300384097 1300384098 1300384099 1300384100 1300384101 1300384102
 [7] 1300384103 1300384104 1300384105 1300384106

[[2]]
 [1]  1  2  3  4  5  6  7  8  9 10

# the 'magic'
> extractFUN(m) <- function(x) .xts(x[[2]],x[[1]])
> m[]
                    [,1]
2011-03-17 12:48:16    1
2011-03-17 12:48:17    2
2011-03-17 12:48:18    3
2011-03-17 12:48:19    4
2011-03-17 12:48:20    5
2011-03-17 12:48:21    6
2011-03-17 12:48:22    7
2011-03-17 12:48:23    8
2011-03-17 12:48:24    9
2011-03-17 12:48:25   10

Some things lost will be tz, time class, colnames, and misc
attributes. (same as bigmemory, ff of course)
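
As a sketch of how the same idea might extend to two data columns, and of how the lost attributes could be put back at extraction time: the column names ("bid", "ask") and the "UTC" time zone below are made up for illustration and are simply hard-coded in the extract function, since the mapped file itself stores only doubles.

```r
library(mmap)
library(xts)

# Illustrative two-column series with a known time zone
x <- xts(cbind(bid = c(99.5, 99.6, 99.7), ask = c(100.5, 100.6, 100.7)),
         as.POSIXct("2011-03-17 12:00:00", tz = "UTC") + 0:2)

# One struct row = index + 2 data values = 3 doubles = 24 bytes
path <- tempfile()
writeBin(raw(nrow(x) * 3 * 8), path)
m <- mmap(path, struct(double(), double(), double()))

m[, 1] <- as.numeric(.index(x))
m[, 2] <- coredata(x)[, "bid"]
m[, 3] <- coredata(x)[, "ask"]

# Rebuild the xts on extraction and re-apply the metadata by hand
extractFUN(m) <- function(v) {
  out <- .xts(cbind(v[[2]], v[[3]]), v[[1]], tzone = "UTC")
  colnames(out) <- c("bid", "ask")
  out
}
y <- m[]
munmap(m)
```

Any process that maps the file needs to agree on the same metadata, because none of it travels with the data.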

All of the above is admittedly very, very ugly .. but will work across
processes.  You just need to know the name of the file that is being
mmapped/shared.  (i.e. don't use tempfile() ... make your own name).
The advantage of a struct is that we can find rows/columns without
loading the whole object into memory. The disadvantage in some cases will be
that a struct in mmap is like a row-major db, whereas xts is like a
col-major one.  If that doesn't make sense to you, pretend I didn't
say it. ;-)
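
A minimal sketch of the cross-process part, assuming both sides agree on a file name in advance (the name "shared_xts.bin" is hypothetical, and tempdir() is used only so the sketch runs anywhere; a real setup would use a fixed path). The writer and reader are shown in one session, but the reader half is what a second R process would run.

```r
library(mmap)
library(xts)

# Agreed-on file name (hypothetical)
path <- file.path(tempdir(), "shared_xts.bin")

# --- writer process ---
x <- xts(as.numeric(1:10), Sys.time() + 1:10)
writeBin(raw(nrow(x) * 2 * 8), path)          # 2 doubles per row
w <- mmap(path, struct(double(), double()))
w[, 1] <- as.numeric(.index(x))
w[, 2] <- coredata(x)[, 1]

# --- reader process: maps the same file by name ---
r <- mmap(path, struct(double(), double()))
extractFUN(r) <- function(v) .xts(v[[2]], v[[1]])
y <- r[]

munmap(r)
munmap(w)
```

Nothing here coordinates the two sides, so a reader could see a partially written update; some form of locking or signalling would have to be added separately.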

By the conference I'll have some simple wrappers to make this as easy
as the rest of xts - e.g. as.mmap will likely get an xts method.  I'll
also talk more about sharing objects across sessions in my Friday
morning workshop at the conference.

Another approach that works well is using a key-value db -
something like RBerkeley from R.  This is incredibly fast and very
simple (see the vignette in the package) - with the only caveat being
that I can't get Oracle's BDB to compile using the R windows
toolchain, so RBerkeley gets nothing to link against if you are one of
the luckless souls bound to the Redmond OS.  If someone solves that
problem for the community (read: won't likely be me), it too will be a
solid cross-platform solution.
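
To illustrate the key-value idea without guessing at RBerkeley's own function signatures (see its vignette for those), here is a sketch in which a plain R environment stands in for the database. The point it shows is only that an xts object can be turned into a raw vector with base R's serialize() and restored with unserialize(), attributes included; the key name "quotes" is made up.

```r
library(xts)

# Illustrative series with column names and a time zone
x <- xts(cbind(bid = c(99.5, 99.6), ask = c(100.5, 100.6)),
         as.POSIXct("2011-03-17 12:00:00", tz = "UTC") + 0:1)

# Hypothetical stand-in for a key-value database
store <- new.env()

# put: the value stored under the key is a raw vector
assign("quotes", serialize(x, NULL), envir = store)

# get: restore the full object, including colnames and tz
y <- unserialize(get("quotes", envir = store))
```

Unlike the mmap struct, this round trip keeps the metadata, at the cost of reading and writing the whole object each time.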

Jeff

On Thu, Mar 17, 2011 at 12:40 PM,  <[hidden email]> wrote:

> Hi to the list
>
> I am looking into inter-process communication between two or more
> instances of R (e.g. one process to receive market data, the other one to
> execute a strategy and place orders).
>
> I have seen that, with the 'bigmemory' package, a matrix defined as
>
> shared <- big.matrix(....)
>
> can be accessed by several different R processes, quite easily. Is there
> an easy way to use this mechanism for sharing an xts timeseries? I
> haven't seen how to do that.
>
> My best guess would be to define the shared matrix as type='double' and
> every time I need to read from or write to it, make the corresponding
> matrix<==>xts conversion.  That would probably work since the matrix
> itself is not huge, but it would be a bit more elegant if I could share
> the actual xts object.
>
> Thankful for any ideas
>
> regards
>
> Soren
>
> http://censix.com



--
Jeffrey Ryan
[hidden email]

www.lemnica.com

R/Finance 2011 April 29th and 30th in Chicago
www.RinFinance.com

Re: xts timeseries as shared-memory objects with bigmemory package ?

soren wilkening
Jeff, that's amazing. Thanks for sharing!!


