outer join of xts's

outer join of xts's

Eric Berger
Hi,
I have a list L of about 2,600 xts's.
Each xts has a single numeric column. About 90% of the xts's have
approximately 500 rows, and the rest have fewer than 500 rows.
I create a single xts using the command

myXts <- Reduce( merge.xts, L )

By default, merge.xts() does an outer join (which is what I want).

The command takes about 80 seconds to complete.
I have plenty of RAM on my computer.

Are there faster ways to accomplish this task?

Thanks,
Eric

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: outer join of xts's

Enrico Schumann-2

Quoting Eric Berger <[hidden email]>:

> Hi,
> I have a list L of about 2,600 xts's.
> Each xts has a single numeric column. About 90% of the xts's have
> approximately 500 rows, and the rest have fewer than 500 rows.
> I create a single xts using the command
>
> myXts <- Reduce( merge.xts, L )
>
> By default, merge.xts() does an outer join (which is what I want).
>
> The command takes about 80 seconds to complete.
> I have plenty of RAM on my computer.
>
> Are there faster ways to accomplish this task?
>
> Thanks,
> Eric
>

Since you already know the number of series and all possible timestamps,
you could preallocate a matrix (number of timestamps times number of series).
You could use the fastmatch package to match the timestamps against the rows.
This is what 'pricetable' in the PMwR package does.  Calling

     library("PMwR")
     do.call(pricetable, L)

should give you a matrix of the merged series, with an attribute 'timestamp',
from which you could create an xts object again.
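
The preallocation idea can be sketched by hand as follows. This is only an
illustration on made-up toy data, not the PMwR implementation: base match()
stands in for the fastmatch package, and the names all_times and M are
hypothetical.

```r
library(xts)

# Hypothetical toy data: three single-column series with overlapping dates
dates <- as.Date("2019-01-01") + 1:5
L <- list(xts(c(1, 2, 3), order.by = dates[1:3]),
          xts(c(4, 5, 6), order.by = dates[2:4]),
          xts(c(7, 8, 9), order.by = dates[3:5]))

# All timestamps that occur in any series, sorted and without duplicates
all_times <- sort(unique(do.call(c, lapply(L, index))))

# Preallocate: one row per timestamp, one column per series
M <- matrix(NA_real_, nrow = length(all_times), ncol = length(L))

# Fill each column by matching the series' timestamps against the rows.
# Base match() is used here; a faster matching function could be swapped in.
for (j in seq_along(L)) {
  rows <- match(index(L[[j]]), all_times)
  M[rows, j] <- coredata(L[[j]])[, 1]
}

# Turn the filled matrix back into an xts object
merged <- xts(M, order.by = all_times)
```

Each series is touched once, so the work grows with the total number of
observations rather than with repeated copies of a growing result.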

I am not sure if it is the fastest way, but it's probably faster than calling
merge repeatedly.

kind regards
     Enrico  (the maintainer of PMwR)

--
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net


Re: outer join of xts's

Gabor Grothendieck
In reply to this post by Eric Berger
You don't need Reduce, as xts already supports multiway merges.  This
performs one multiway merge rather than k-1 two-way merges.

    do.call("merge", L)
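
A rough way to compare the two approaches on your own machine is sketched
below. The data are made up (200 series instead of 2,600, with hypothetical
column names s1, s2, ...), so the timings will not match the figures reported
in this thread.

```r
library(xts)

set.seed(1)
dates <- as.Date("2019-01-01") + 0:499

# Hypothetical test data: 200 single-column series, each on a random
# subset of 400 of the 500 dates
make_series <- function(i) {
  keep <- sort(sample(length(dates), 400))
  x <- xts(rnorm(400), order.by = dates[keep])
  colnames(x) <- paste0("s", i)
  x
}
L <- lapply(1:200, make_series)

# k-1 two-way merges: every step builds a new, wider intermediate object
t_pairwise <- system.time(pairwise <- Reduce(merge, L))

# one multiway merge over all series
t_multiway <- system.time(multiway <- do.call("merge", L))

# elapsed seconds for each approach
print(c(pairwise = t_pairwise[["elapsed"]],
        multiway = t_multiway[["elapsed"]]))
```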

On Thu, Jan 2, 2020 at 6:13 AM Eric Berger <[hidden email]> wrote:

>
> Hi,
> I have a list L of about 2,600 xts's.
> Each xts has a single numeric column. About 90% of the xts's have
> approximately 500 rows, and the rest have fewer than 500 rows.
> I create a single xts using the command
>
> myXts <- Reduce( merge.xts, L )
>
> By default, merge.xts() does an outer join (which is what I want).
>
> The command takes about 80 seconds to complete.
> I have plenty of RAM on my computer.
>
> Are there faster ways to accomplish this task?
>
> Thanks,
> Eric
>



--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com


Re: outer join of xts's

Eric Berger
Hi Gabor,
This is great, thanks. It brought the time down to about 4 seconds.
The command
do.call("merge.xts",L)
also works in this case.
Suppose that instead of the default "outer" join I wanted to use, say, a
"left" join.
Is that possible? I tried a few ways of adding the
join="left"
parameter to the do.call() command but I could not get the syntax to work
(assuming it's even possible).

Thanks,
Eric


On Thu, Jan 2, 2020 at 3:23 PM Gabor Grothendieck <[hidden email]>
wrote:

> You don't need Reduce, as xts already supports multiway merges.  This
> performs one multiway merge rather than k-1 two-way merges.
>
>     do.call("merge", L)
>

Re: outer join of xts's

Gabor Grothendieck
It is not clear what multiway left join means but merge.zoo (though
not merge.xts) supports a generalized all= argument which is a logical
vector having the same length as L that can be TRUE or FALSE for each
object merged.  The objects corresponding to TRUE will have all their
times included in the result but the ones with FALSE will only be
included if they correspond to an already existing time. merge.zoo is
R based whereas merge.xts is C based so I would not expect it to be as
fast although it is more powerful.

If All is the logical vector having the same length as L then:

    Lzoo <- lapply(L, as.zoo)
    do.call("merge", c(Lzoo, list(all = All)))
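
As a small worked sketch of the above, assuming toy data and taking only the
first series as the one whose times must all be kept (the objects a1, a2, a3
and the choice of All are made up for illustration):

```r
library(xts)   # loading xts also attaches zoo

dates <- as.Date("2019-01-01") + 1:5
a1 <- xts(c(1, 2, 3), order.by = dates[1:3]); colnames(a1) <- "a1"
a2 <- xts(c(4, 5, 6), order.by = dates[2:4]); colnames(a2) <- "a2"
a3 <- xts(c(7, 8, 9), order.by = dates[3:5]); colnames(a3) <- "a3"
L <- list(a1, a2, a3)

# Keep every time of the first series; the others only contribute values
# at times that are already present
All <- c(TRUE, FALSE, FALSE)

Lzoo <- lapply(L, as.zoo)
left_zoo <- do.call("merge", c(Lzoo, list(all = All)))

# Convert back if an xts object is needed downstream
left_xts <- as.xts(left_zoo)
dim(left_xts)
```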

On Thu, Jan 2, 2020 at 9:31 AM Eric Berger <[hidden email]> wrote:

>
> Hi Gabor,
> This is great, thanks. It brought the time down to about 4 seconds.
> The command
> do.call("merge.xts",L)
> also works in this case.
> Suppose that instead of the default "outer" join I wanted to use, say, a "left" join.
> Is that possible? I tried a few ways of adding the
> join="left"
> parameter to the do.call() command but I could not get the syntax to work (assuming it's even possible).
>
> Thanks,
> Eric
>
>

Re: outer join of xts's

Duncan Murdoch-2
In reply to this post by Eric Berger
On 02/01/2020 9:31 a.m., Eric Berger wrote:

> Hi Gabor,
> This is great, thanks. It brought the time down to about 4 seconds.
> The command
> do.call("merge.xts",L)
> also works in this case.
> Suppose that instead of the default "outer" join I wanted to use, say, a
> "left" join.
> Is that possible? I tried a few ways of adding the
> join="left"
> parameter to the do.call() command but I could not get the syntax to work
> (assuming it's even possible).

This should work:

   do.call("merge", c(L, join = "left"))

The second argument to do.call is a list which becomes the arguments to
the function being called.  Your time series should be unnamed entries
in the list, while other arguments to merge() should be named.
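
To make the argument handling concrete, here is a small sketch with exactly
two toy series (a1 and a2 are made up for the example). It only shows how
do.call() turns the list into a call.

```r
library(xts)

dates <- as.Date("2019-01-01") + 1:5
a1 <- xts(c(1, 2, 3), order.by = dates[1:3]); colnames(a1) <- "a1"
a2 <- xts(c(4, 5, 6), order.by = dates[2:4]); colnames(a2) <- "a2"

# Unnamed entries become positional arguments, named entries become
# named arguments of the function being called
args <- c(list(a1, a2), join = "left")
names(args)

# Equivalent to merge(a1, a2, join = "left")
left <- do.call("merge", args)
dim(left)
```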

Duncan Murdoch


Re: outer join of xts's

Gabor Grothendieck
join = "left" only applies with merge.xts if there are two objects.
If there are more, it acts the same as all = TRUE (an outer join).
See the Details section of ?merge.xts

On Thu, Jan 2, 2020 at 1:29 PM Duncan Murdoch <[hidden email]> wrote:

>
> This should work:
>
>    do.call("merge", c(L, join = "left"))
>
> The second argument to do.call is a list which becomes the arguments to
> the function being called.  Your time series should be unnamed entries
> in the list, while other arguments to merge() should be named.
>
> Duncan Murdoch

Re: outer join of xts's

Eric Berger
Hi Gabor and Duncan,
Thanks for your comments. As Gabor points out, Duncan's suggestion
does not work when more than two objects are merged.
For those interested, here is a minimal reproducible example to illustrate:

library(xts)
dtV <- as.Date("2019-01-01")+1:5
a <- xts(x=rnorm(5),order.by=dtV)
a1 <- a[1:3,]
a2 <- a[2:4,]
a3 <- a[3:5,]
colnames(a1) <- "a1"
colnames(a2) <- "a2"
colnames(a3) <- "a3"
L <- list(a1,a2,a3)
b.outer.1 <- Reduce(merge.xts,L)
b.outer.2 <- do.call(merge.xts,L)
identical(b.outer.1,b.outer.2)
# TRUE
dim(b.outer.1)
# [1] 5 3
b.left.1 <- merge.xts( merge.xts(a1,a2,join="left"), a3, join="left" )
f <- function(x,y) { merge.xts(x,y,join="left")}
b.left.2 <- Reduce(f,L)
identical(b.left.1,b.left.2)
# TRUE
dim(b.left.1)
# [1] 3 3
b.left.3 <- do.call("merge",c(L,join="left"))
# Warning message:
# In merge.xts(c(0.316095105296857, -1.69318390538755, -1.16430042971811 :
#  'join' only applicable to two object merges
dim(b.left.3)
# [1] 5 3
identical(b.outer.1,b.left.3)
# TRUE
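
One possible way to get the result of the left-join chain without k-1
two-way merges is sketched below: do a single multiway outer join and then
keep only the rows of the first series. This is not something proposed in the
thread, and it assumes the first series has unique timestamps; the names
left_chain and left_fast are made up.

```r
library(xts)

set.seed(42)
dates <- as.Date("2019-01-01") + 1:5
a  <- xts(rnorm(5), order.by = dates)
a1 <- a[1:3, ]; a2 <- a[2:4, ]; a3 <- a[3:5, ]
colnames(a1) <- "a1"; colnames(a2) <- "a2"; colnames(a3) <- "a3"
L <- list(a1, a2, a3)

# Chain of two-way left joins, as in the example above
left_chain <- Reduce(function(x, y) merge(x, y, join = "left"), L)

# One multiway outer join, then keep only the times of the first series
outer_all <- do.call("merge", L)
left_fast <- outer_all[index(L[[1]]), ]

# Compare the two results
all.equal(coredata(left_chain), coredata(left_fast))
```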



On Thu, Jan 2, 2020 at 11:39 PM Gabor Grothendieck
<[hidden email]> wrote:

>
> join = "left" only applies with merge.xts if there are two objects.
> If there are more, it acts the same as all = TRUE (an outer join).
> See the Details section of ?merge.xts
>

Re: outer join of xts's

Joshua Ulrich
On Fri, Jan 3, 2020 at 1:14 AM Eric Berger <[hidden email]> wrote:

>
> Hi Gabor and Duncan,
> Thanks for your comments. As Gabor points out, Duncan's suggestion
> does not work.
> For those interested, here is a minimal reproducible example to illustrate:
>
> library(xts)
> dtV <- as.Date("2019-01-01")+1:5
> a <- xts(x=rnorm(5),order.by=dtV)
> a1 <- a[1:3,]
> a2 <- a[2:4,]
> a3 <- a[3:5,]
> colnames(a1) <- "a1"
> colnames(a2) <- "a2"
> colnames(a3) <- "a3"
> L <- list(a1,a2,a3)
> b.outer.1 <- Reduce(merge.xts,L)
> b.outer.2 <- do.call(merge.xts,L)
> identical(b.outer.1,b.outer.2)
> # TRUE
> dim(b.outer.1)
> # [1] 5 3
> b.left.1 <- merge.xts( merge.xts(a1,a2,join="left"), a3, join="left" )
> f <- function(x,y) { merge.xts(x,y,join="left")}
> b.left.2 <- Reduce(f,L)
> identical(b.left.1,b.left.2)
> # TRUE
> dim(b.left.1)
> # [1] 3 3
> b.left.3 <- do.call("merge",c(L,join="left"))
> # Warning message:
> # In merge.xts(c(0.316095105296857, -1.69318390538755, -1.16430042971811 :
> #  'join' only applicable to two object merges
> dim(b.left.3)
> # [1] 5 3
> identical(b.outer.1,b.left.3)
> # TRUE
>
It's good practice to call generics and let dispatch determine what
method to call, instead of calling the method directly.  There's no
guarantee that merge.xts() will handle any objects you may
accidentally pass to it. So you should replace all your merge.xts()
calls with merge().
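
A short sketch of what dispatch does here, using made-up toy objects: the
same merge() call picks the method from the class of its first argument.

```r
library(xts)   # loading xts also attaches zoo

dates <- as.Date("2019-01-01") + 1:5
x1 <- xts(c(1, 2, 3), order.by = dates[1:3])
x2 <- xts(c(4, 5, 6), order.by = dates[2:4])

# The same data as plain zoo objects
z1 <- as.zoo(x1)
z2 <- as.zoo(x2)

xx <- merge(x1, x2)   # first argument is xts, so merge.xts is used
zz <- merge(z1, z2)   # first argument is zoo, so merge.zoo is used

class(xx)
class(zz)
```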




--
Joshua Ulrich  |  about.me/joshuaulrich
FOSS Trading  |  www.fosstrading.com


Re: outer join of xts's

Eric Berger
Hi Joshua,
Thanks for the comment but I guess I prefer a different behavior. If
merge.xts() cannot handle the objects I pass in, I want it to fail.
In fact, I typically adopt a naming convention where the variable name
indicates the type. In my normal coding style my example would look
like:
aXts <- xts(x=rnorm(5),order.by=dtV)
a1Xts <- aXts[1:3,]
a2Xts <- aXts[2:4,]
etc
Happy to hear your views on why other conventions might be better.

Regards,
Eric


On Fri, Jan 3, 2020 at 3:45 PM Joshua Ulrich <[hidden email]> wrote:

>
> It's good practice to call generics and let dispatch determine what
> method to call, instead of calling the method directly.  There's no
> guarantee that merge.xts() will handle any objects you may
> accidentally pass to it. So you should replace all your merge.xts()
> calls with merge().
>
> >
> >
> > On Thu, Jan 2, 2020 at 11:39 PM Gabor Grothendieck
> > <[hidden email]> wrote:
> > >
> > > join = "left" only applies with merge.xts if there are two objects.
> > > If there are more it acts the same as join = TRUE..
> > > See the Details section of ?merge.xts
> > >
> > > On Thu, Jan 2, 2020 at 1:29 PM Duncan Murdoch <[hidden email]> wrote:
> > > >
> > > > On 02/01/2020 9:31 a.m., Eric Berger wrote:
> > > > > Hi Gabor,
> > > > > This is great, thanks. It brought the time down to about 4 seconds.
> > > > > The command
> > > > > do.call("merge.xts",L)
> > > > > also works in this case.
> > > > > Suppose that instead of the default "outer" join I wanted to use, say, a
> > > > > "left" join.
> > > > > Is that possible? I tried a few ways of adding the
> > > > > join="left"
> > > > > parameter to the do.call() command but I could not get the syntax to work
> > > > > (assuming it's even possible).
> > > >
> > > > This should work:
> > > >
> > > >    do.call("merge", c(L, join = "left"))
> > > >
> > > > The second argument to do.call is a list which becomes the arguments to
> > > > the function being called.  Your time series should be unnamed entries
> > > > in the list, while other arguments to merge() should be named.
> > > >
> > > > Duncan Murdoch
> > > >
> > > > >
> > > > > Thanks,
> > > > > Eric
> > > > >
> > > > >
> > > > > On Thu, Jan 2, 2020 at 3:23 PM Gabor Grothendieck <[hidden email]>
> > > > > wrote:
> > > > >
> > > > > >> You don't need Reduce as xts already supports multiway merges.  This
> > > > > >> performs one multiway merge rather than k-1 two-way merges.
> > > > >>
> > > > >>      do.call("merge", L)
> > > > >>
> > > > >> On Thu, Jan 2, 2020 at 6:13 AM Eric Berger <[hidden email]> wrote:
> > > > >>>
> > > > >>> Hi,
> > > > >>> I have a list L of about 2,600 xts's.
> > > > >>> Each xts has a single numeric column. About 90% of the xts's have
> > > > >>> approximately 500 rows, and the rest have fewer than 500 rows.
> > > > >>> I create a single xts using the command
> > > > >>>
> > > > >>> myXts <- Reduce( merge.xts, L )
> > > > >>>
> > > > >>> By default, merge.xts() does an outer join (which is what I want).
> > > > >>>
> > > > >>> The command takes about 80 seconds to complete.
> > > > >>> I have plenty of RAM on my computer.
> > > > >>>
> > > > >>> Are there faster ways to accomplish this task?
> > > > >>>
> > > > >>> Thanks,
> > > > >>> Eric
> > > > >>>
> > > > >>>          [[alternative HTML version deleted]]
> > > > >>>
> > > > >>> ______________________________________________
> > > > >>> [hidden email] mailing list -- To UNSUBSCRIBE and more, see
> > > > >>> https://stat.ethz.ch/mailman/listinfo/r-help
> > > > >>> PLEASE do read the posting guide
> > > > >> http://www.R-project.org/posting-guide.html
> > > > >>> and provide commented, minimal, self-contained, reproducible code.
> > > > >>
> > > > >>
> > > > >>
> > > > >> --
> > > > >> Statistics & Software Consulting
> > > > >> GKX Group, GKX Associates Inc.
> > > > >> tel: 1-877-GKX-GROUP
> > > > >> email: ggrothendieck at gmail.com
> > > > >>
> > > > >
> > > > >       [[alternative HTML version deleted]]
> > > > >
> > > > > ______________________________________________
> > > > > [hidden email] mailing list -- To UNSUBSCRIBE and more, see
> > > > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > > > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > > > > and provide commented, minimal, self-contained, reproducible code.
> > > > >
> > > >
> > >
> > >
> > > --
> > > Statistics & Software Consulting
> > > GKX Group, GKX Associates Inc.
> > > tel: 1-877-GKX-GROUP
> > > email: ggrothendieck at gmail.com
> >
>
>
>
> --
> Joshua Ulrich  |  about.me/joshuaulrich
> FOSS Trading  |  www.fosstrading.com


Re: outer join of xts's

Joshua Ulrich
On Fri, Jan 3, 2020 at 8:20 AM Eric Berger <[hidden email]> wrote:

>
> Hi Joshua,
> Thanks for the comment but I guess I prefer a different behavior. If
> merge.xts() cannot handle the objects I pass in, I want it to fail.
> In fact, I typically adopt a naming convention where the variable name
> indicates the type. In my normal coding style my example would look
> like:
> aXts <- xts(x=rnorm(5),order.by=dtV)
> a1Xts <- aXts[1:3,]
> a2Xts <- aXts[2:4,]
> etc
> Happy to hear your views on why other conventions might be better.
>
What you describe above works for merge.xts(), but my point is more
general. You don't have a guarantee that every method for every class
in every package will behave that way if you pass objects they don't
expect.

For example, you could call merge.xts() on objects that are some new
class that inherits from xts.  That would "work".  But the new class
may have its own merge() method that does something different than
merge.xts(). In that case, the result of merge.xts() could be
malformed in an unexpected way.
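
The scenario described above can be sketched in base R without xts. The classes "series" and "tagged_series" and their constructors are hypothetical toy classes made up for this illustration; the only real function involved is the base generic merge().

```r
# Hypothetical toy classes; they are not part of xts or any package.
new_series <- function(values) {
  structure(list(values = values), class = "series")
}
new_tagged <- function(values, tag) {
  structure(list(values = values, tag = tag),
            class = c("tagged_series", "series"))
}

# Method for the parent class: concatenates values, knows nothing about tags
merge.series <- function(x, y, ...) {
  new_series(c(x$values, y$values))
}

# Method for the subclass: concatenates values and also combines the tags
merge.tagged_series <- function(x, y, ...) {
  new_tagged(c(x$values, y$values), tag = paste(x$tag, y$tag, sep = "+"))
}

a <- new_tagged(1:2, "a")
b <- new_tagged(3:4, "b")

via_generic <- merge(a, b)         # dispatch selects merge.tagged_series()
via_method  <- merge.series(a, b)  # runs without error, but the tag is lost

class(via_generic)
class(via_method)
via_generic$tag
is.null(via_method$tag)
```

Calling the parent method directly does not fail; it quietly returns an object of the parent class without the subclass's extra field, which is the kind of malformed result the paragraph above warns about.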

Also, methods don't need to be exported.  I would prefer that
merge.xts(), lag.xts(), etc. were not exported.  I have considered
un-exporting them.  The reason I don't is because it would break code
like yours.
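
As an illustration of a method that is registered but not exported, base R's stats package can serve as an example (the choice of t.test() here is only for illustration and is not taken from the thread). The generic dispatches to the default method as usual, while the method's name is not expected to be visible on the search path in a standard session.

```r
x <- c(5.1, 4.9, 5.6, 5.8, 6.0)

# Calling the generic works: dispatch finds the registered default method
res <- t.test(x)
class(res)

# The method itself is registered for dispatch but, in a standard session,
# is not expected to be visible under its own name
exists("t.test.default")

# It can still be retrieved from the S3 registry if it is really needed
m <- getS3method("t.test", "default")
is.function(m)
```

Code that only ever calls the generic is unaffected by whether a package exports its methods, which is why un-exporting merge.xts() would break only code that calls it by name.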

I hope that helps you understand my rationale.

Best,
Josh


--
Joshua Ulrich  |  about.me/joshuaulrich
FOSS Trading  |  www.fosstrading.com
