removing repeating values from xts series

9 messages

removing repeating values from xts series

Hi fellows,

I am facing a case that I cannot solve with my limited knowledge of R unless I write the function myself, which I would like to avoid (reusing is better than reinventing the wheel). The relevant information follows.

Input scenario:
An xts time series object with duplicates; the object contains bid, bid volume, ask, ask volume.
Example:
01-01-2010 09:00:01     100     1       101     1
01-01-2010 09:00:02     100     1       101     1
01-01-2010 09:00:03     100     1       101     1
01-01-2010 09:00:04     101     1       102     1
01-01-2010 09:00:05     102     1       102     1
01-01-2010 09:00:06     100     1       101     1
...

Goal:
A time series with only non-repeating values, removing the duplicates in between the values.

I tried "unique" already, but that one returns only the unique values from within the whole time series and not on a running basis.

Example code:
The following example code shows, with a non-xts series, what I want to achieve ...
> y = c(1,1,2,2,1,1,1,2,3,4,3,3,3,3,3,1)
> removeDuplicates <- function(input)
{
        index = 2
        ret = c(input[1])
        for(i in 2:length(input))
        {
                if(input[i]!=input[i-1])
                {
                        ret[index] = input[i]
                        index = index + 1
                }
        }
        ret
}
>
> removeDuplicates(y)
[1] 1 2 1 2 3 4 3 1
>

How can I make this with an xts series? Is there a function for this?

Thanks in advance,
with kind regards,
Ulrich

--
Ulrich Staudinger
activequant.org

_______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-finance
-- Subscriber-posting only. If you want to post, subscribe first.
-- Also note that this is not the r-help list where general R questions should go.

Re: removing repeating values from xts series

Ulrich,

try duplicated(xts.object, ...) or possibly duplicated(as.data.frame(xts.object), ...) if all columns should be considered.

Regards,
david

-----Original Message-----
From: [hidden email] [mailto:[hidden email]] On Behalf Of Ulrich Staudinger
Sent: Wednesday, September 15, 2010 8:28 AM
To: r-sig-finance
Subject: [R-SIG-Finance] removing repeating values from xts series

How can I make this with an xts series? Is there a function for this?

Re: removing repeating values from xts series

Hi David,

as far as I understand, duplicated works internally very much like unique.

With a vector y (in this case not a time series), duplicated yields:
> y
[1] 1 1 2 3 2 2 2 2 1
> duplicated(y)
[1] FALSE  TRUE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE

But what I would like to have is:
FALSE TRUE FALSE FALSE FALSE TRUE TRUE TRUE FALSE
or ...
1 2 3 2 1

I am not so sure that duplicated is what I want, unless I didn't spot something ... some other approach maybe?

Regards,
Ulrich

On Wed, Sep 15, 2010 at 9:08 AM, Lüthi David (XICD 1) <[hidden email]> wrote:
> Ulrich,
> try duplicated(xts.object, ...) or possibly duplicated(as.data.frame(xts.object), ...) if all columns should be considered.
> Regards, david
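The difference between duplicated() and the mask requested above can be reproduced with a short sketch. It uses only base R on the plain vector from the message; the variable names are made up for illustration, and the sketch assumes the vector has no NA values.

```r
# Vector from the message above.
y <- c(1, 1, 2, 3, 2, 2, 2, 2, 1)

# duplicated() flags any value that occurred anywhere earlier in the vector.
seen_before <- duplicated(y)

# Run-based mask: TRUE only where a value equals its immediate predecessor.
# The first element has no predecessor, so it is never flagged.
repeats_previous <- c(FALSE, y[-1] == y[-length(y)])

print(seen_before)
print(repeats_previous)

# Dropping the flagged positions keeps one value per run.
print(y[!repeats_previous])
```

The second mask is the one asked for: a value that returns after a different value in between is kept, which duplicated() alone would not do.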

Re: removing repeating values from xts series

Hi Ulrich,

I see. Ad hoc I'd use rle (run length encoding) and some function of cumsum(rle(y)$lengths) to get indexes of non-duplicates.

Regards,
david

-----Original Message-----
From: Ulrich Staudinger [mailto:[hidden email]]
Sent: Wednesday, September 15, 2010 9:25 AM
To: Lüthi David (XICD 1)
Cc: r-sig-finance
Subject: Re: [R-SIG-Finance] removing repeating values from xts series

I am not so sure that duplicated is what I want, unless I didn't spot something ... some other approach maybe?
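One possible reading of the rle suggestion, sketched on the plain vector from the earlier message. The "some function of cumsum" is assumed here to mean turning the run end positions into run start positions; the variable names are illustrative only.

```r
y <- c(1, 1, 2, 3, 2, 2, 2, 2, 1)

# rle() collapses the vector into runs of equal consecutive values.
runs <- rle(y)

# cumsum of the run lengths gives the index where each run ends;
# subtracting the length and adding 1 gives the index where it starts.
run_ends <- cumsum(runs$lengths)
run_starts <- run_ends - runs$lengths + 1

print(run_starts)
print(y[run_starts])
```

Since rle() works on atomic vectors, a multi-column series would presumably need one key per row first (for example the columns pasted together) before the start indexes could be used to subset the rows.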

Re: removing repeating values from xts series

So you want to compare

y[-1,]

with

y[-nrow(y),]

I think.  And save the rows that aren't all equal.  Yes?

On 15/09/2010 08:25, Ulrich Staudinger wrote:
> But what I would like to have is:
> FALSE TRUE FALSE FALSE FALSE TRUE TRUE TRUE FALSE
> or ...
> 1 2 3 2 1

--
Patrick Burns
[hidden email]
http://www.burns-stat.com
http://www.portfolioprobe.com/blog
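The row comparison described above can be sketched as a small helper. The function name drop_repeated_rows and the column names are hypothetical, not part of xts or base R. The demo runs on a plain matrix holding the sample quotes from the first message so that it needs no packages; because the helper only relies on as.matrix() and logical row indexing, it should carry over to an xts object, but that is an assumption rather than something tested here. It also assumes there are no NA values.

```r
# Hypothetical helper: keep the first row of every run of identical
# consecutive rows.
drop_repeated_rows <- function(x) {
  m <- as.matrix(x)            # plain values, one row per observation
  n <- nrow(m)
  if (n < 2) return(x)
  # Compare rows 2..n with rows 1..(n-1); a row counts as changed
  # if at least one column differs from the row before it.
  changed <- rowSums(m[-1, , drop = FALSE] != m[-n, , drop = FALSE]) > 0
  x[c(TRUE, changed), , drop = FALSE]
}

# Sample quotes from the first message (column names are illustrative).
quotes <- matrix(
  c(100, 1, 101, 1,
    100, 1, 101, 1,
    100, 1, 101, 1,
    101, 1, 102, 1,
    102, 1, 102, 1,
    100, 1, 101, 1),
  ncol = 4, byrow = TRUE,
  dimnames = list(NULL, c("bid", "bid_vol", "ask", "ask_vol"))
)

result <- drop_repeated_rows(quotes)
print(result)
```

The first row is always kept because it has no predecessor to compare against.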

Re: removing repeating values from xts series

I want to compare

y(t) with y(t-1)
where
t = 2 ... length(y)
y is an xts time series

On Wed, Sep 15, 2010 at 9:33 AM, Patrick Burns <[hidden email]> wrote:
> So you want to compare
>
> y[-1,]
>
> with
>
> y[-nrow(y),]
>
> I think.  And save the rows
> that aren't all equal.  Yes?

Re: removing repeating values from xts series

I think diff and a logical operation on all four columns would help. I had hoped to find a ready-made function for this ...

Thanks ...

On Wed, Sep 15, 2010 at 9:46 AM, Ulrich Staudinger <[hidden email]> wrote:
> I want to compare
>
> y(t) with y(t-1)
> where
> t = 2... length(y)
> y is an xts timeseries

--
Ulrich Staudinger
[hidden email]
http://www.activequant.org
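A minimal sketch of the diff idea, again on a plain matrix with the sample quotes from the first message and made-up column names. It assumes exact equality is the right test (prices stored as floating-point numbers may need a tolerance instead). On an xts object, diff() pads the first row with NA by default, I believe, so the first row would need separate handling there.

```r
# Sample quotes from the first message (column names are illustrative).
quotes <- matrix(
  c(100, 1, 101, 1,
    100, 1, 101, 1,
    100, 1, 101, 1,
    101, 1, 102, 1,
    102, 1, 102, 1,
    100, 1, 101, 1),
  ncol = 4, byrow = TRUE,
  dimnames = list(NULL, c("bid", "bid_vol", "ask", "ask_vol"))
)

# diff() on a matrix subtracts each row from the next one, column by column,
# so the result has one row fewer than the input.
step <- diff(quotes)

# Logical operation over all four columns: did anything move at all?
changed <- apply(step != 0, 1, any)

# The first row has no predecessor and is always kept.
keep <- c(TRUE, changed)
deduplicated <- quotes[keep, , drop = FALSE]
print(deduplicated)
```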