running count in data.frame


Mark Knecht
Hi,
   I need to keep a running count of events that have happened in my
data.frame. I found a document called usingR that had an example of
doing this for random coin flips and I tried to modify it. It seems to
sort of work in the beginning, but then it stops and I don't
understand why. I'm essentially trying to duplicate the Excel
capability of =SUM($A$1:$A(Row number)).

   The example looked like this:

x = cumsum(sample(c(-1,1),100,replace=T))

which does seem to work: (100 shortened to 20 for email)

> cumsum(sample(c(-1,1),20,replace=T))
 [1] 1 0 1 0 1 2 3 4 5 4 3 4 5 6 7 6 5 4 5 6
> cumsum(sample(c(-1,1),20,replace=T))
 [1] 1 2 1 2 1 2 3 2 3 4 5 6 7 8 7 8 7 8 9 8
> cumsum(sample(c(-1,1),20,replace=T))
 [1] 1 0 1 0 1 0 1 2 3 4 5 6 7 8 7 8 7 6 7 8
> cumsum(sample(c(-1,1),20,replace=T))
 [1]  1  0  1  0  1  0 -1  0  1  0  1  0  1  2  1  0  1  2  3  4
> cumsum(sample(c(-1,1),20,replace=T))
 [1]  1  2  1  0 -1  0 -1 -2 -1 -2 -1 -2 -1 -2 -3 -2 -3 -4 -5 -6
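
For reference, the Excel running total mentioned above corresponds to cumsum() applied to a column. A minimal sketch, assuming a hypothetical data frame `df` with a column `A` (neither name comes from the thread); set.seed() is added only so the random flips are repeatable:

```r
# Hypothetical example data: 10 coin flips coded as -1 / 1.
set.seed(42)  # arbitrary seed, only for repeatability
df <- data.frame(A = sample(c(-1, 1), 10, replace = TRUE))

# Running total of column A, one value per row,
# like =SUM($A$1:$A(Row number)) filled down a column in Excel.
df$running <- cumsum(df$A)

# Sanity check: the last running value equals the overall sum.
stopifnot(df$running[nrow(df)] == sum(df$A))
df
```

Because cumsum() returns one value per input element, the result can be assigned straight to a new column of the same data frame.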

However, that example doesn't have to read from the data.frame, so I
tried to build on some earlier help from today, but it isn't working for
me. The goal is that MyFrame$lc keeps a running total of events in the
MyFrame$l column, and likewise for $pc and $p. It seems that $lc
starts off OK until it gets to a 0 and then resets back to 0, which I
don't want. The $pc counter never seems to count. I also get a warning
message I don't understand, so clearly I'm doing something very wrong
here:

> F1 <- RunningCount(F1)
Warning messages:
1: In MyFrame$pc[pos] <- cumsum(as.integer(pos)) :
  number of items to replace is not a multiple of replacement length
2: In MyFrame$lc[pos] <- cumsum(as.integer(pos)) :
  number of items to replace is not a multiple of replacement length
> F1
    x  y p  l pc lc
1   1 -4 0 -4  0  1
2   2 -3 0 -3  0  2
3   3 -2 0 -2  0  3
4   4 -1 0 -1  0  4
5   5  0 0  0  0  0
6   6  1 1  0  0  0
7   7  2 2  0  0  0
8   8  3 3  0  0  0
9   9  4 4  0  0  0
10 10  5 5  0  0  0
>

I wanted $lc to go up to 4 and then hold 4 until the end. $pc should
have stayed 0 until line 6 and then gone up to 5 at the end.

Any and all inputs appreciated on what I'm doing wrong.

Thanks,
Mark





AddCols = function (MyFrame) {
        MyFrame$p<-0
        MyFrame$l<-0
        MyFrame$pc<-0
        MyFrame$lc<-0
        return(MyFrame)
}

BinPosNeg = function (MyFrame) {

## Positive y in p column, negative y in l column
        pos <- MyFrame$y > 0
        MyFrame$p[pos] <- MyFrame$y[pos]
        MyFrame$l[!pos] <- MyFrame$y[!pos]
        return(MyFrame)
}

RunningCount = function (MyFrame) {
## Running count of p & l events

        pos <- (MyFrame$p > 0)
        MyFrame$pc[pos] <- cumsum(as.integer(pos))
        pos <- (MyFrame$l < 0)
        MyFrame$lc[pos] <- cumsum(as.integer(pos))

        return(MyFrame)
}

F1 <- data.frame(x=1:10, y=-4:5)
F1 <- AddCols(F1)
F1
F1 <- BinPosNeg(F1)
F1
F1 <- RunningCount(F1)
F1

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
markknecht@gmail.com
Re: running count in data.frame

jholtman
Not exactly sure what you want to count.  Does this do what you want? (I made a
change in RunningCount: the counts are assigned to the whole column instead of
to the [pos] subset.)

> AddCols = function (MyFrame) {
+        MyFrame$p<-0
+        MyFrame$l<-0
+        MyFrame$pc<-0
+        MyFrame$lc<-0
+        return(MyFrame)
+ }
>
> BinPosNeg = function (MyFrame) {
+
+ ## Positive y in p column, negative y in l column
+        pos <- MyFrame$y > 0
+        MyFrame$p[pos] <- MyFrame$y[pos]
+        MyFrame$l[!pos] <- MyFrame$y[!pos]
+        return(MyFrame)
+ }
>
> RunningCount = function (MyFrame) {
+ ## Running count of p & l events
+
+        pos <- (MyFrame$p > 0)
+        MyFrame$pc <- cumsum(as.integer(pos))
+        pos <- (MyFrame$l < 0)
+        MyFrame$lc <- cumsum(as.integer(pos))
+
+        return(MyFrame)
+ }
>
> F1 <- data.frame(x=1:10, y=-4:5)
> F1 <- AddCols(F1)
> F1
    x  y p l pc lc
1   1 -4 0 0  0  0
2   2 -3 0 0  0  0
3   3 -2 0 0  0  0
4   4 -1 0 0  0  0
5   5  0 0 0  0  0
6   6  1 0 0  0  0
7   7  2 0 0  0  0
8   8  3 0 0  0  0
9   9  4 0 0  0  0
10 10  5 0 0  0  0
> F1 <- BinPosNeg(F1)
> F1
    x  y p  l pc lc
1   1 -4 0 -4  0  0
2   2 -3 0 -3  0  0
3   3 -2 0 -2  0  0
4   4 -1 0 -1  0  0
5   5  0 0  0  0  0
6   6  1 1  0  0  0
7   7  2 2  0  0  0
8   8  3 3  0  0  0
9   9  4 4  0  0  0
10 10  5 5  0  0  0
> F1 <- RunningCount(F1)
> F1
    x  y p  l pc lc
1   1 -4 0 -4  0  1
2   2 -3 0 -3  0  2
3   3 -2 0 -2  0  3
4   4 -1 0 -1  0  4
5   5  0 0  0  0  4
6   6  1 1  0  1  4
7   7  2 2  0  2  4
8   8  3 3  0  3  4
9   9  4 4  0  4  4
10 10  5 5  0  5  4
>
>
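
The warning in the original version seems to come from a length mismatch: cumsum() returns one value per row, while [pos] on the left-hand side selects only some of the rows. A minimal sketch of that reading, using the thread's example data; the names `counts`, `pc_bad` and `pc_good` are made up for illustration:

```r
# Same example data as in the thread.
F1 <- data.frame(x = 1:10, y = -4:5)
pos <- F1$y > 0                    # TRUE only for rows 6..10

counts <- cumsum(as.integer(pos))  # one value per row: 0 0 0 0 0 1 2 3 4 5
length(counts)                     # 10 replacement values
sum(pos)                           # but [pos] selects only 5 rows

# Original form: 10 values offered for 5 slots. R warns and uses only the
# first 5 values (all 0 here), so the counter never moves.
pc_bad <- numeric(nrow(F1))
suppressWarnings(pc_bad[pos] <- counts)  # warning silenced for this demo only

# Changed form: assign the whole vector, lengths match, no warning.
pc_good <- counts

stopifnot(all(pc_bad == 0))
stopifnot(all(pc_good == c(0, 0, 0, 0, 0, 1, 2, 3, 4, 5)))
```

The same reasoning would explain why $lc counted to 4 and then dropped to 0: the first four values of the cumulative sum filled the four selected rows, and the remaining rows were never touched.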





--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?

Re: running count in data.frame

Mark Knecht
Yes Jim. Thanks. That's what I was looking for. My mistake was letting [pos]
block the assignment.

Cheers,
Mark

On Tue, Jun 30, 2009 at 8:04 PM, jim holtman<[hidden email]> wrote:
> Not exactly sure what you want to count.  Does this do what you want (made a
> change in RunningCount)
>
<SNIP>
>> RunningCount = function (MyFrame) {
> + ## Running count of p & l events
> +
> +        pos <- (MyFrame$p > 0)
> +        MyFrame$pc <- cumsum(as.integer(pos))
> +        pos <- (MyFrame$l < 0)
> +        MyFrame$lc <- cumsum(as.integer(pos))
> +
<SNIP>

>> F1 <- RunningCount(F1)
>> F1
>     x  y p  l pc lc
> 1   1 -4 0 -4  0  1
> 2   2 -3 0 -3  0  2
> 3   3 -2 0 -2  0  3
> 4   4 -1 0 -1  0  4
> 5   5  0 0  0  0  4
> 6   6  1 1  0  1  4
> 7   7  2 2  0  2  4
> 8   8  3 3  0  3  4
> 9   9  4 4  0  4  4
> 10 10  5 5  0  5  4
<SNIP>
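
A more compact variant of the same idea, offered only as a sketch and not something proposed in the thread: since cumsum() accepts logical values directly, the counts can be computed from y without the intermediate p and l columns or as.integer(). For this data, y > 0 and y < 0 select the same rows as p > 0 and l < 0.

```r
F1 <- data.frame(x = 1:10, y = -4:5)

# A comparison yields TRUE/FALSE; cumsum() treats them as 1/0,
# so the running count needs no as.integer() and no helper columns.
F1$pc <- cumsum(F1$y > 0)   # running count of positive y
F1$lc <- cumsum(F1$y < 0)   # running count of negative y

stopifnot(all(F1$pc == c(0, 0, 0, 0, 0, 1, 2, 3, 4, 5)))
stopifnot(all(F1$lc == c(1, 2, 3, 4, 4, 4, 4, 4, 4, 4)))
F1
```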