re-indexing data under the zoo package

Aidan Corcoran
Dear all,

I was hoping someone could help me to generate an index based on data
in zooreg format. The data are of the form

> head(dq)
        sphsxs eunrfi irhont
1995(1)  670.8   82.9     NA
1995(2)  686.0   82.9     NA
1995(3)  682.6   83.0     NA
1995(4)  692.7   82.7     NA
1996(1)  686.0   81.5   33.6
1996(2)  697.8   82.0   34.6

and I would like to index each of the three variables to 100 in
1996(1). I have made a few failed attempts based on extracting the
values at that date

> dq[index(dq)==1996.00]
        sphsxs eunrfi irhont
1996(1)    686   81.5   33.6

and then trying to divide the series by those values

> dq/dq[index(dq)==1996.00]
     sphsxs eunrfi irhont
1996      1      1      1

but this results in a single row. One option might be to replicate the
1996 row using rep, but

> rep(dq[index(dq)==1996.00],2)
[1] 686.0  81.5  33.6 686.0  81.5  33.6

seems to repeat the data within a single vector, and I'm not sure how
to get it to repeat the row down through a zoo object (and suspect
there might be an easier way).

Any help much appreciated.

thanks
Aidan

_______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-finance
-- Subscriber-posting only. If you want to post, subscribe first.
-- Also note that this is not the r-help list where general R questions should go.

Re: re-indexing data under the zoo package

Arun.stat
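Hi Aidan, is this what you wanted?

library(zoo)
# six observations of three random series, converted to a plain matrix
dat <- as.matrix(zooreg(matrix(rnorm(18), 6), start=as.Date("2011-01-01"), frequency=1))
colnames(dat) <- paste("column", 1:3, sep="")
dat
# divide every row by the first row and scale so the first row equals 100
dat1 <- t(apply(dat, 1, function(x) x/dat[1,]*100))
dat1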

Thanks,

Re: re-indexing data under the zoo package

Stefan Grosse-2
In reply to this post by Aidan Corcoran
On 28.02.2011 18:36, Aidan Corcoran wrote:

>> head(dq)
>         sphsxs eunrfi irhont
> 1995(1)  670.8   82.9     NA
> 1995(2)  686.0   82.9     NA
> 1995(3)  682.6   83.0     NA
> 1995(4)  692.7   82.7     NA
> 1996(1)  686.0   81.5   33.6
> 1996(2)  697.8   82.0   34.6
>

>> dq/dq[index(dq)==1996.00]
>      sphsxs eunrfi irhont
> 1996      1      1      1


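# transpose so each series is divided by its own base-period value,
# scale to 100, then rebuild the quarterly zooreg object
dq2 <- zooreg(t(t(dq) / as.numeric(dq[index(dq) == 1996.00])) * 100, start = c(1995, 1), frequency = 4)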

Stefan

_______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-finance
-- Subscriber-posting only. If you want to post, subscribe first.
-- Also note that this is not the r-help list where general R questions should go.
Reply | Threaded
Open this post in threaded view
|

Re: re-indexing data under the zoo package

Jeffrey Ryan-2
In reply to this post by Aidan Corcoran
Haven't checked the math per se, but this is probably what you want:

z <- zooreg(matrix(rnorm(18),nc=3), start=1995, freq=4)

z[index(z)==1996]
1996(1) -1.274160 0.6255354 0.4150378

t(t(z)/as.numeric(z[index(z)==1996]))
               x.1          x.2        x.3
1995(1)  0.9233051 0.2428291171  0.1960831
1995(2)  0.6585486 1.2370305821 -0.7346616
1995(3) -0.6777132 0.0009999497 -2.6567196
1995(4)  0.3662096 1.3125214482 -0.6704602
1996(1)  1.0000000 1.0000000000  1.0000000
1996(2) -0.2993159 1.9259156026  1.5291669

The trick is that zoo (and xts) merge by time before doing basic Ops, so
only the index values common to both objects survive. You are essentially
left with one row unless you get rid of the 'zoo'-ness of one of your objects.

All the t()'s are just to get things looking like they did at the beginning.
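
To make the merge-by-time behaviour concrete, here is a minimal sketch on made-up data. The values and the names z, base, aligned and rebased are illustrative only and do not come from the posts above; it assumes the zoo package is installed. It shows that zoo-by-zoo division keeps only the shared index value, while dividing by a plain numeric vector (with the transposes) rebases every row.

```r
library(zoo)

# Hypothetical data: two quarterly series starting in 1995 Q1
z <- zooreg(matrix(c(10, 20, 30, 40, 1, 2, 3, 4), ncol = 2),
            start = 1995, frequency = 4)

# One-row zoo object holding the base period (1995 Q2)
base <- z[index(z) == 1995.25]

# zoo / zoo: operands are aligned on their common index values first,
# so only the single shared time point is left
aligned <- z / base
NROW(aligned)

# Strip the index from the divisor. After t(), each series is a row,
# so the length-2 vector recycles with one base value per series.
rebased <- t(t(coredata(z)) / as.numeric(coredata(base)))
dim(rebased)
rebased
```

The second transpose only restores the original orientation, with time running down the rows.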

Best,
Jeff


--
Jeffrey Ryan
[hidden email]

www.lemnica.com


Re: re-indexing data under the zoo package

Aidan Corcoran
thanks very much Jeffrey and Stefan, this is just what I wanted. It's
a neat answer (just a pity zoo doesn't provide some utility for this).

cheers
Aidan


_______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-finance
-- Subscriber-posting only. If you want to post, subscribe first.
-- Also note that this is not the r-help list where general R questions should go.
Reply | Threaded
Open this post in threaded view
|

Re: re-indexing data under the zoo package

Gabor Grothendieck
In reply to this post by Aidan Corcoran

Any of these will refer to the data at 1996:

library(zoo)
# dq <- ... shown at end ...

dq[ I(1996) ]
dq[ "1996" ]
window(dq, 1996, 1996)
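
As a small illustration of why the I() wrapper matters when the index is itself numeric, here is a sketch on a made-up series (x and the by_* names are hypothetical, not from the thread). In zoo a bare numeric subscript is treated as a position, whereas I() or window() select by index value.

```r
library(zoo)

# Hypothetical quarterly series with a plain numeric index (1995, 1995.25, ...)
x <- zooreg(c(5, 6, 7, 8, 9), start = 1995, frequency = 4)

by_position <- x[5]                                  # fifth observation
by_time     <- x[I(1996)]                            # observation whose index equals 1996
by_window   <- window(x, start = 1996, end = 1996)   # same observation via window()

# All three pick the 1996 Q1 value here
coredata(by_position)
coredata(by_time)
coredata(by_window)
```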

1. Here is a slight variation of Arun.stat's solution that will
produce values relative to 1996:

dq100 <- dq
dq100[] <- t(apply(dq, 1, "/", coredata(dq["1996"])))

> dq100
           sphsxs   eunrfi   irhont
1995(1) 0.9778426 1.017178       NA
1995(2) 1.0000000 1.017178       NA
1995(3) 0.9950437 1.018405       NA
1995(4) 1.0097668 1.014724       NA
1996(1) 1.0000000 1.000000 1.000000
1996(2) 1.0172012 1.006135 1.029762

2. Another thing you might consider would be to use "yearqtr" class for dq:

# convert time to yearqtr
dq.yq <- dq
time(dq.yq) <- as.yearqtr(time(dq.yq))

# index relative to 1996 Q1
# (plain assignment gives a matrix; copy dq.yq and assign with [] as in 1. to keep the zoo class)
dq.yq100 <- t(apply(dq.yq, 1, "/", coredata(dq.yq[as.yearqtr("1996 Q1")])))

> dq.yq100
             [,1]     [,2]     [,3]
1995 Q1 0.9778426 1.017178       NA
1995 Q2 1.0000000 1.017178       NA
1995 Q3 0.9950437 1.018405       NA
1995 Q4 1.0097668 1.014724       NA
1996 Q1 1.0000000 1.000000 1.000000
1996 Q2 1.0172012 1.006135 1.029762

3. ts class would work here too since it is regularly spaced:

tt <- as.ts(dq)
tt[] <- t(apply(tt, 1, "/", window(tt, 1996, 1996)))
tt

Here is the dq used above:

dq <-
structure(c(670.8, 686, 682.6, 692.7, 686, 697.8, 82.9, 82.9,
83, 82.7, 81.5, 82, NA, NA, NA, NA, 33.6, 34.6), .Dim = c(6L,
3L), .Dimnames = list(NULL, c("sphsxs", "eunrfi", "irhont")), index = c(1995,
1995.25, 1995.5, 1995.75, 1996, 1996.25), class = c("zooreg",
"zoo"), frequency = 4)

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

_______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-finance
-- Subscriber-posting only. If you want to post, subscribe first.
-- Also note that this is not the r-help list where general R questions should go.
Reply | Threaded
Open this post in threaded view
|

Re: re-indexing data under the zoo package

Gabor Grothendieck

4. and here is one more solution that is slightly simpler as it avoids
the transposition:

dq2 <- dq
dq2[] <- mapply("/", as.data.frame(dq), coredata(dq["1996"]))
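
The numbered solutions give values equal to 1 at the base period, while the original question asked for 100. A possible way to wrap this up for repeated use is sketched below; rebase() is a hypothetical helper, not a zoo function, and it swaps in base R's sweep() in place of mapply(). The data are the dq values posted earlier in the thread, rebuilt with zooreg().

```r
library(zoo)

# Hypothetical helper (not part of zoo): rescale every column so that
# the row at `base_time` equals `scale`.
rebase <- function(z, base_time, scale = 100) {
  # base-period values as a plain numeric vector, one per column
  base <- as.numeric(coredata(window(z, start = base_time, end = base_time)))
  out <- z
  # sweep() divides column j of the data matrix by base[j];
  # assigning with [] keeps the index and the zoo class
  out[] <- sweep(coredata(z), 2, base, "/") * scale
  out
}

# The dq data from the thread
dq <- zooreg(cbind(sphsxs = c(670.8, 686.0, 682.6, 692.7, 686.0, 697.8),
                   eunrfi = c(82.9, 82.9, 83.0, 82.7, 81.5, 82.0),
                   irhont = c(NA, NA, NA, NA, 33.6, 34.6)),
             start = 1995, frequency = 4)

dq100 <- rebase(dq, 1996)
dq100
```

Columns that are NA before the base period simply stay NA, since NA divided by a number is NA.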

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

_______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-finance
-- Subscriber-posting only. If you want to post, subscribe first.
-- Also note that this is not the r-help list where general R questions should go.
Reply | Threaded
Open this post in threaded view
|

Re: re-indexing data under the zoo package

Aidan Corcoran
Thanks, Gabor, for the solution and the general tips. As this is a
common task for me, answer no. 4 will save a lot of typing.

Aidan


_______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-finance
-- Subscriber-posting only. If you want to post, subscribe first.
-- Also note that this is not the r-help list where general R questions should go.