removing repeating values from xts series

Hi fellows,

I am facing a case that I cannot solve with my limited knowledge of R unless I write the function myself, which I would like to avoid (reusing is better than reinventing the wheel). The relevant information follows.

Input scenario: an xts time series object with duplicates. The object contains bid, bid volume, ask and ask volume. Example:

    01-01-2010 09:00:01 100 1 101 1
    01-01-2010 09:00:02 100 1 101 1
    01-01-2010 09:00:03 100 1 101 1
    01-01-2010 09:00:04 101 1 102 1
    01-01-2010 09:00:05 102 1 102 1
    01-01-2010 09:00:06 100 1 101 1
    ...

Goal: a time series with only non-repeating values, that is, with the consecutive duplicates in between removed. I already tried "unique", but that returns the unique values from the whole time series and does not work on a running basis.

Example code: the following illustrates with a non-xts series what I want to achieve ...

    > y = c(1,1,2,2,1,1,1,2,3,4,3,3,3,3,3,1)
    > removeDuplicates <- function(input) {
          index = 2
          ret = c(input[1])
          # seq_along(input)[-1] is empty for a length-1 input, unlike 2:length(input)
          for(i in seq_along(input)[-1])
          {
                  # keep a value only if it differs from its predecessor
                  if(input[i]!=input[i-1])
                  {
                          ret[index] = input[i]
                          index = index + 1
                  }
          }
          ret
      }
    > removeDuplicates(y)
    [1] 1 2 1 2 3 4 3 1

How can I do this with an xts series? Is there a function for this?

Thanks in advance, with kind regards,
Ulrich

-- Ulrich Staudinger
activequant.org
Re: removing repeating values from xts series

 Ulrich, try duplicated(xts.object, ...) or possibly duplicated(as.data.frame(xts.object), ...) if all columns should be considered. Regards, david
Re: removing repeating values from xts series

Hi David,

as far as I understand, duplicated works internally very much like unique. With a vector y (in this case not a time series), duplicated yields:

    > y
    [1] 1 1 2 3 2 2 2 2 1
    > duplicated(y)
    [1] FALSE  TRUE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE

But what I would like to have is:

    FALSE TRUE FALSE FALSE FALSE TRUE TRUE TRUE FALSE

or, as the resulting values:

    1 2 3 2 1

I am not so sure that duplicated is what I want, unless I missed something. Some other approach, maybe?

Regards,
Ulrich
Re: removing repeating values from xts series

Hi Ulrich, I see. Ad hoc, I'd use rle (run length encoding) and some function of cumsum(rle(y)$lengths) to get the indexes of the non-duplicates. Regards, david
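A minimal sketch of the rle idea, using the plain vector y from the previous post rather than an xts object. The variable names (runs, run_starts, is_repeat) are made up for illustration. It assumes a single comparable value per observation; a multi-column object would first need some per-row key, which is not shown here.

```r
# Run-length encoding collapses each run of equal neighbours into one entry.
y <- c(1, 1, 2, 3, 2, 2, 2, 2, 1)
runs <- rle(y)

# Each run starts one position after the previous run ends, so shifting
# the run lengths by one and accumulating gives the start index of every run.
run_starts <- cumsum(c(1, head(runs$lengths, -1)))

# Keep only the first element of each run.
collapsed <- y[run_starts]

# Logical mask: TRUE where an element repeats its immediate predecessor.
is_repeat <- !(seq_along(y) %in% run_starts)

print(collapsed)
print(is_repeat)
```

For a plain vector, collapsed is the same as runs$values; the index vector run_starts is the part that could be reused to subset rows of a time series.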
Re: removing repeating values from xts series

 So you want to compare y[-1,] with y[-nrow(y),] I think.  And save the rows that aren't all equal.  Yes?

-- Patrick Burns
[hidden email]
http://www.burns-stat.com
http://www.portfolioprobe.com/blog
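The row comparison suggested above could look roughly like this. It is only a sketch: the sample data is a hypothetical reconstruction of the quotes from the first post (the column names and the UTC time zone are assumptions), and it assumes the series contains no NA values, since a comparison involving NA would yield NA.

```r
library(xts)

# Hypothetical sample data mirroring the example quotes in the first post.
times <- as.POSIXct("2010-01-01 09:00:00", tz = "UTC") + 1:6
quotes <- xts(
  cbind(bid    = c(100, 100, 100, 101, 102, 100),
        bidvol = c(1, 1, 1, 1, 1, 1),
        ask    = c(101, 101, 101, 102, 102, 101),
        askvol = c(1, 1, 1, 1, 1, 1)),
  order.by = times
)

m <- coredata(quotes)   # plain numeric matrix without the time index
n <- nrow(m)

# Compare rows 2..n with rows 1..(n-1); TRUE where at least one column differs.
changed <- rowSums(m[-1, , drop = FALSE] != m[-n, , drop = FALSE]) > 0

# The first row has no predecessor, so it is always kept.
deduped <- quotes[c(TRUE, changed), ]
print(deduped)
```

Working on coredata avoids any index alignment between the two shifted copies, because the comparison is then an ordinary element-wise matrix comparison.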
Re: removing repeating values from xts series

I want to compare y(t) with y(t-1) for t = 2 ... length(y), where y is an xts time series.

-- Ulrich Staudinger
activequant.org
Re: removing repeating values from xts series

I think diff and a logical operation on all four columns would help. I hoped I would find a ready-made function for ... Thanks ...

-- Ulrich Staudinger
[hidden email]
http://www.activequant.org
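One way the diff idea could be written down, as a sketch only. drop_repeats is a hypothetical helper name, not a function from xts, and the two-column sample data is invented for the example. It relies on diff for xts objects padding the first row with NA by default, and it treats NA differences as "no change", which may or may not be what is wanted for real data.

```r
library(xts)

# drop_repeats() is a hypothetical helper, not part of the xts package.
drop_repeats <- function(x) {
  if (nrow(x) < 2) return(x)          # nothing to compare against
  d <- diff(x)                        # row t minus row t-1; first row is NA
  # A row counts as changed if any column has a non-zero difference.
  changed <- rowSums(coredata(d) != 0, na.rm = TRUE) > 0
  changed[1] <- TRUE                  # always keep the first observation
  x[changed, ]
}

# Invented sample data with two columns.
times <- as.POSIXct("2010-01-01 09:00:00", tz = "UTC") + 1:5
x <- xts(cbind(bid = c(100, 100, 101, 101, 100),
               ask = c(101, 101, 102, 103, 101)),
         order.by = times)

result <- drop_repeats(x)
print(result)
```

The fourth row is kept even though the bid is unchanged, because the ask moved; a row is dropped only when every column equals the previous row.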