Applying a certain formula to a repeated sample data


Applying a certain formula to a repeated sample data

Ogbos
Dear List,
I have data with three columns. The data are of the form:
1 8590 12516
2 8641 98143
3 8705 98916
4 8750 89911
5 8685 104835
6 8629 121963
7 8676 77655
1 8577 81081
2 8593 83385
3 8642 112164
4 8708 103684
5 8622 83982
6 8593 75944
7 8600 97036
1 8650 104911
2 8730 114098
3 8731 99421
4 8715 85707
5 8717 81273
6 8739 106462
7 8684 110635
1 8713 105214
2 8771 92456
3 8759 109270
4 8762 99150
5 8730 77306
6 8780 86324
7 8804 90214
1 8797 99894
2 8863 95177
3 8873 95910
4 8827 108511
5 8806 115636
6 8869 85542
7 8854 111018
1 8571 93247
2 8533 85105
3 8553 114725
4 8561 122195
5 8532 100945
6 8560 108552
7 8634 108707
1 8646 117420
2 8633 113823
3 8680 82763
4 8765 121072
5 8756 89835
6 8750 104578
7 8790 88429

I wish to calculate the average of the second and third columns for each
repeated block of 7 days, as indexed by the first column. The data have
1442 rows, that is, 206 blocks of 7. So I should arrive at 206 data points
for each of the two columns after calculating the mean of each group 1-7.

I have tried both factor/tapply and aggregate but do not seem to be making
progress.

Thank you very much for any ideas.

Best wishes
Ogbos


______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: Applying a certain formula to a repeated sample data

Jim Lemon-4
Hi Ogbos,
If we assume that you have a 3 column data frame named oodf, how about:

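# sum(1:7) is 28, so the running total of column 1 passes a multiple of 28
# at the end of each 7-day block; this yields a block index 0, 1, 2, ...
oodf[,4]<-floor((cumsum(oodf[,1])-1)/28)
# mean of columns 2 and 3 within each block
col2means<-by(oodf[,2],oodf[,4],mean)
col3means<-by(oodf[,3],oodf[,4],mean)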

Jim
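
As a cross-check on the cumsum() trick, the same block means can be sketched with an index built from the row position instead. The data frame name oodf follows the reply above; the column names n, CR and WW and the helper column week are assumptions made for illustration, and only the first two weeks of the posted data are used.

```r
# First two weeks of the posted data; column names are assumed for illustration
oodf <- data.frame(
  n  = rep(1:7, times = 2),
  CR = c(8590, 8641, 8705, 8750, 8685, 8629, 8676,
         8577, 8593, 8642, 8708, 8622, 8593, 8600),
  WW = c(12516, 98143, 98916, 89911, 104835, 121963, 77655,
         81081, 83385, 112164, 103684, 83982, 75944, 97036)
)

# Index from the reply: sum(1:7) is 28, so this steps by one per week
jim_index <- floor((cumsum(oodf$n) - 1) / 28)

# Position-based index: rows 1-7 -> 0, rows 8-14 -> 1, ...
# (assumes every week is complete and the rows are in order)
oodf$week <- (seq_len(nrow(oodf)) - 1) %/% 7

# One row per week holding the mean of CR and WW
weekly_means <- aggregate(cbind(CR, WW) ~ week, data = oodf, FUN = mean)
weekly_means
```

Both indices agree on complete weeks. The position-based one does not depend on the day numbers summing to 28, but it does rely on no rows being missing.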

On Wed, Nov 28, 2018 at 2:06 PM Ogbos Okike <[hidden email]> wrote:


Applying a certain formula to a repeated sample data: SOLVED

Ogbos
On Wed, Nov 28, 2018 at 4:31 AM Jim Lemon <[hidden email]> wrote:

> Hi Ogbos,
> If we assume that you have a 3 column data frame named oodf, how about:
>
> oodf[,4]<-floor((cumsum(oodf[,1])-1)/28)
> col2means<-by(oodf[,2],oodf[,4],mean)
> col3means<-by(oodf[,3],oodf[,4],mean)
>
> Jim

Dear Jim,

Thank you so much.

The code has made life much easier for me.

Best regards
Ogbos

Re: Applying a certain formula to a repeated sample data

Ogbos
In reply to this post by Jim Lemon-4
Dear Jim,

I would also like to use the calculated means in a formula applied to the
same data frame. In particular, I would like to subtract the mean of each
seven-day group from each of its seven days and divide the result by the
same mean. If m1 denotes the mean of each seven-day group in column 1 and
c1 denotes the column 1 data, my formula has the form:
aa<-(c1-m1)/m1

I tried it on the first 7 rows and got what I am looking for:
  -0.0089986156
  -0.0031149054
   0.0042685741
   0.0094600831
   0.0019612367
  -0.0044993078
   0.0009229349

But doing it manually would take too much time.

Many thanks for going a step further to assist me.

Warmest regards.
Ogbos
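
The calculation described in this message can be sketched for a single seven-day group as follows. The vector holds the second column of the first seven posted rows; the names c1, m1 and aa are taken from the message, and treating that column as the one meant is an assumption.

```r
# Second column of the first seven posted rows
c1 <- c(8590, 8641, 8705, 8750, 8685, 8629, 8676)

# Mean of this seven-day group
m1 <- mean(c1)

# Fractional deviation of each day from the group mean
aa <- (c1 - m1) / m1
print(aa, digits = 8)
```

Because each value is a deviation from the group's own mean, the seven results sum to zero apart from rounding error.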


Re: Applying a certain formula to a repeated sample data

Ogbos
Dear Jim,

I don't think I stated my problem clearly.

I have been trying to apply the formula manually to some rows.

This is what I have done.
I cut and pasted each block of rows 1-7 and saved each block in a separate
file, as shown below:

1 8590 12516
2 8641 98143
3 8705 98916
4 8750 89911
5 8685 104835
6 8629 121963
7 8676 77655


1 8577 81081
2 8593 83385
3 8642 112164
4 8708 103684
5 8622 83982
6 8593 75944
7 8600 97036


1 8650 104911
2 8730 114098
3 8731 99421
4 8715 85707
5 8717 81273
6 8739 106462
7 8684 110635


1 8713 105214
2 8771 92456
3 8759 109270
4 8762 99150
5 8730 77306
6 8780 86324
7 8804 90214


1 8797 99894
2 8863 95177
3 8873 95910
4 8827 108511
5 8806 115636
6 8869 85542
7 8854 111018


1 8571 93247
2 8533 85105
3 8553 114725
4 8561 122195
5 8532 100945
6 8560 108552
7 8634 108707


1 8646 117420
2 8633 113823
3 8680 82763
4 8765 121072
5 8756 89835
6 8750 104578
7 8790 88429

Each file is then read as:
d1<-read.table("dat1",col.names=c("n","CR","WW"))
d2<-read.table("dat2",col.names=c("n","CR","WW"))
d3<-read.table("dat3",col.names=c("n","CR","WW"))
d4<-read.table("dat4",col.names=c("n","CR","WW"))
d5<-read.table("dat5",col.names=c("n","CR","WW"))
d6<-read.table("dat6",col.names=c("n","CR","WW"))
d7<-read.table("dat7",col.names=c("n","CR","WW"))

And my formula for percentage change is applied as follows for column 2:
a1<-((d1$CR-mean(d1$CR))/mean(d1$CR))*100
a2<-((d2$CR-mean(d2$CR))/mean(d2$CR))*100
a3<-((d3$CR-mean(d3$CR))/mean(d3$CR))*100
a4<-((d4$CR-mean(d4$CR))/mean(d4$CR))*100
a5<-((d5$CR-mean(d5$CR))/mean(d5$CR))*100
a6<-((d6$CR-mean(d6$CR))/mean(d6$CR))*100
a7<-((d7$CR-mean(d7$CR))/mean(d7$CR))*100

a1 to a7 give the percentage change in the data.

Instead of doing this one file at a time, could you please indicate how I
might apply this formula to the whole data frame, preferably with some
code?

Thank you again.

Best
Ogbos
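
One way the file-by-file procedure in this message might be folded into a single pass is sketched below with split(), lapply() and unsplit(). The data frame name dat and the helper column block are hypothetical, and only the first two posted weeks are included.

```r
# Two of the posted weeks stacked in one data frame
dat <- data.frame(
  n  = rep(1:7, times = 2),
  CR = c(8590, 8641, 8705, 8750, 8685, 8629, 8676,
         8577, 8593, 8642, 8708, 8622, 8593, 8600)
)

# Hypothetical helper column: 1 for the first seven rows, 2 for the next seven
# (assumes the number of rows is a multiple of 7)
dat$block <- rep(seq_len(nrow(dat) / 7), each = 7)

# split() plays the role of the separate files dat1, dat2, ...
pieces <- split(dat$CR, dat$block)

# Apply the percentage-change formula to each piece, as a1, a2, ... did
pct <- lapply(pieces, function(x) (x - mean(x)) / mean(x) * 100)

# Put the results back in the original row order
dat$CRpct <- unsplit(pct, dat$block)
dat
```

Each element of pct corresponds to one of the vectors a1, a2, ... from the manual approach, so the intermediate files are no longer needed.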

On Wed, Nov 28, 2018 at 5:15 AM Ogbos Okike <[hidden email]>
wrote:


Re: Applying a certain formula to a repeated sample data

Jeff Newmiller
Thank you for providing a clarifying example. I think a useful function
for you to get familiar with is the "ave" function. It is kind of like
aggregate, except that it is meant for cases where the operation you apply
to each group of elements returns the same number of elements as were
given to it.
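
The difference in output shape can be seen on a toy vector. This is only an illustrative sketch: the values and grouping below are made up, and tapply() stands in for the collapsing style of summary.

```r
x <- c(10, 20, 30, 40, 50, 60)   # toy values, not from the thread
g <- c(1, 1, 1, 2, 2, 2)         # toy grouping vector

# tapply() collapses each group to a single summary value
group_means <- tapply(x, g, mean)

# ave() returns one value per input element, here the mean of its group
per_row_means <- ave(x, g, FUN = mean)

# Because the result lines up with x, it can be combined element by element
deviation <- x - per_row_means
```

Here group_means has one entry per group, while per_row_means and deviation have the same length as x.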

Also, in the future please figure out how to tell gmail to send plain text
to the mailing list instead of HTML. You were lucky this time, but often
HTML email gets horribly mangled as it goes through the mailing list and
gets all the formatting removed.

###############################
dta <- read.table( text =
"n CR WW
1 8590 12516
2 8641 98143
3 8705 98916
4 8750 89911
5 8685 104835
6 8629 121963
7 8676 77655
1 8577 81081
2 8593 83385
3 8642 112164
4 8708 103684
5 8622 83982
6 8593 75944
7 8600 97036
1 8650 104911
2 8730 114098
3 8731 99421
4 8715 85707
5 8717 81273
6 8739 106462
7 8684 110635
1 8713 105214
2 8771 92456
3 8759 109270
4 8762 99150
5 8730 77306
6 8780 86324
7 8804 90214
1 8797 99894
2 8863 95177
3 8873 95910
4 8827 108511
5 8806 115636
6 8869 85542
7 8854 111018
1 8571 93247
2 8533 85105
3 8553 114725
4 8561 122195
5 8532 100945
6 8560 108552
7 8634 108707
1 8646 117420
2 8633 113823
3 8680 82763
4 8765 121072
5 8756 89835
6 8750 104578
7 8790 88429
",header=TRUE)

# one way to make a grouping vector: start a new group whenever n decreases
dta$G <- cumsum( c( 1, diff( dta$n ) < 0 ) )
# your operation
fn <- function( x ) {
  m <- mean( x )
  ( x - m ) / m * 100
}
# your operation, computing for each group
gn <- function( x, g ) {
  ave( x, g, FUN = fn )
}
# do the computations
dta$CRpct <- gn( dta$CR, dta$G )
dta$WWpct <- gn( dta$WW, dta$G )
dta
#>    n   CR     WW G        CRpct       WWpct
#> 1  1 8590  12516 1 -0.899861560 -85.4932369
#> 2  2 8641  98143 1 -0.311490540  13.7533758
#> 3  3 8705  98916 1  0.426857407  14.6493272
#> 4  4 8750  89911 1  0.946008306   4.2120148
#> 5  5 8685 104835 1  0.196123673  21.5097882
#> 6  6 8629 121963 1 -0.449930780  41.3621243
#> 7  7 8676  77655 1  0.092293493  -9.9933934
#> 8  1 8577  81081 2 -0.490594182 -10.9385886
#> 9  2 8593  83385 2 -0.304963951  -8.4078170
#> 10 3 8642 112164 2  0.263528632  23.2037610
#> 11 4 8708 103684 2  1.029253336  13.8891155
#> 12 5 8622  83982 2  0.031490843  -7.7520572
#> 13 6 8593  75944 2 -0.304963951 -16.5811987
#> 14 7 8600  97036 2 -0.223750725   6.5867850
#> 15 1 8650 104911 3 -0.682347538   4.5366096
#> 16 2 8730 114098 3  0.236197225  13.6908244
#> 17 3 8731  99421 3  0.247679034  -0.9337985
#> 18 4 8715  85707 3  0.063970082 -14.5988581
#> 19 5 8717  81273 3  0.086933701 -19.0170347
#> 20 6 8739 106462 3  0.339533510   6.0820746
#> 21 7 8684 110635 3 -0.291966014  10.2401827
#> 22 1 8713 105214 4 -0.534907614  11.6017662
#> 23 2 8771  92456 4  0.127203640  -1.9307991
#> 24 3 8759 109270 4 -0.009784895  15.9040146
#> 25 4 8762  99150 4  0.024462238   5.1696079
#> 26 5 8730  77306 4 -0.340840523 -18.0005879
#> 27 6 8780  86324 4  0.229945042  -8.4350859
#> 28 7 8804  90214 4  0.503922112  -4.3089157
#> 29 1 8797  99894 5 -0.500896767  -1.7465519
#> 30 2 8863  95177 5  0.245600995  -6.3860849
#> 31 3 8873  95910 5  0.358706717  -5.6651229
#> 32 4 8827 108511 5 -0.161579602   6.7289318
#> 33 5 8806 115636 5 -0.399101617  13.7369184
#> 34 6 8869  85542 5  0.313464428 -15.8628500
#> 35 7 8854 111018 5  0.143805846   9.1947595
#> 36 1 8571  93247 6  0.088415855 -11.0088128
#> 37 2 8533  85105 6 -0.355331643 -18.7792102
#> 38 3 8553 114725 6 -0.121780328   9.4889267
#> 39 4 8561 122195 6 -0.028359802  16.6179943
#> 40 5 8532 100945 6 -0.367009209  -3.6621512
#> 41 6 8560 108552 6 -0.040037368   3.5976637
#> 42 7 8634 108707 6  0.824102496   3.7455895
#> 43 1 8646 117420 7 -0.816125860  14.4890796
#> 44 2 8633 113823 7 -0.965257293  10.9818643
#> 45 3 8680  82763 7 -0.426089807 -19.3028471
#> 46 4 8765 121072 7  0.549000328  18.0499220
#> 47 5 8756  89835 7  0.445755490 -12.4073713
#> 48 6 8750 104578 7  0.376925598   1.9676287
#> 49 7 8790  88429 7  0.835791544 -13.7782761

#' Created on 2018-11-27 by the [reprex package](http://reprex.tidyverse.org) (v0.2.0).
###############################
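
A possible follow-up, sketched under the same grouping idea, applies the percentage formula to several columns in one step and then checks the result. The function name pct_dev, the vector cols and the check table are hypothetical, and only the first two posted weeks are used to keep the example short.

```r
dta <- data.frame(
  n  = rep(1:7, times = 2),
  CR = c(8590, 8641, 8705, 8750, 8685, 8629, 8676,
         8577, 8593, 8642, 8708, 8622, 8593, 8600),
  WW = c(12516, 98143, 98916, 89911, 104835, 121963, 77655,
         81081, 83385, 112164, 103684, 83982, 75944, 97036)
)

# New group whenever the day counter n drops back down
dta$G <- cumsum(c(1, diff(dta$n) < 0))

# Percentage deviation from the mean of the values supplied
pct_dev <- function(x) {
  m <- mean(x)
  (x - m) / m * 100
}

# Apply per group to every column named in cols; new names get a "pct" suffix
cols <- c("CR", "WW")
dta[paste0(cols, "pct")] <- lapply(dta[cols], function(x) ave(x, dta$G, FUN = pct_dev))

# Sanity check: within each group the deviations should sum to zero (up to rounding)
check <- aggregate(cbind(CRpct, WWpct) ~ G, data = dta, FUN = sum)
check
```

If a group sum in check is visibly different from zero, the grouping vector probably does not line up with the seven-day blocks.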

On Wed, 28 Nov 2018, Ogbos Okike wrote:


---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<[hidden email]>        Basics: ##.#.       ##.#.  Live Go...
                                       Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k


Applying a certain formula to a repeated sample data: RESOLVED

Ogbos
Dear Jeff,

This is great! Many, many thanks.

I have also enabled the plain text mode option, and I hope this message
arrives as plain text.

Thanks again.

Warmest regards.
Ogbos

On Wed, Nov 28, 2018 at 7:06 AM Jeff Newmiller <[hidden email]> wrote:

>
> Thank you for providing a clarifying example. I think a useful function
> for you to get familiar with is the "ave" function. It is kind of like
> aggregate except that it works when the operation you want to apply to the
> group of elements will returns the same number of elements as were given
> to it.
>
> Also, in the future please figure out how to tell gmail to send plain text
> to the mailing list instead of HTML. You were lucky this time, but often
> HTML email gets horribly mangled as it goes through the mailing list and
> gets all the formatting removed.
>
> ###############################
> dta <- read.table( text =
> "n CR WW
> 1 8590 12516
> 2 8641 98143
> 3 8705 98916
> 4 8750 89911
> 5 8685 104835
> 6 8629 121963
> 7 8676 77655
> 1 8577 81081
> 2 8593 83385
> 3 8642 112164
> 4 8708 103684
> 5 8622 83982
> 6 8593 75944
> 7 8600 97036
> 1 8650 104911
> 2 8730 114098
> 3 8731 99421
> 4 8715 85707
> 5 8717 81273
> 6 8739 106462
> 7 8684 110635
> 1 8713 105214
> 2 8771 92456
> 3 8759 109270
> 4 8762 99150
> 5 8730 77306
> 6 8780 86324
> 7 8804 90214
> 1 8797 99894
> 2 8863 95177
> 3 8873 95910
> 4 8827 108511
> 5 8806 115636
> 6 8869 85542
> 7 8854 111018
> 1 8571 93247
> 2 8533 85105
> 3 8553 114725
> 4 8561 122195
> 5 8532 100945
> 6 8560 108552
> 7 8634 108707
> 1 8646 117420
> 2 8633 113823
> 3 8680 82763
> 4 8765 121072
> 5 8756 89835
> 6 8750 104578
> 7 8790 88429
> ",header=TRUE)
>
> # one way to make a grouping vector
> dta$G <- cumsum( c( 1, diff( dta$n ) < 0 )
> )
> # your operation
> fn <- function( x ) {
>    m <- mean( x )
>   ( x - m ) / m * 100
> }
> # your operation, computing for each group
> gn <- function( x, g ) {
>    ave( x, g, FUN = fn )
> }
> # do the computations
> dta$CRpct <- gn( dta$CR, dta$G )
> dta$WWpct <- gn( dta$WW, dta$G )
> dta
> #>    n   CR     WW G        CRpct       WWpct
> #> 1  1 8590  12516 1 -0.899861560 -85.4932369
> #> 2  2 8641  98143 1 -0.311490540  13.7533758
> #> 3  3 8705  98916 1  0.426857407  14.6493272
> #> 4  4 8750  89911 1  0.946008306   4.2120148
> #> 5  5 8685 104835 1  0.196123673  21.5097882
> #> 6  6 8629 121963 1 -0.449930780  41.3621243
> #> 7  7 8676  77655 1  0.092293493  -9.9933934
> #> 8  1 8577  81081 2 -0.490594182 -10.9385886
> #> 9  2 8593  83385 2 -0.304963951  -8.4078170
> #> 10 3 8642 112164 2  0.263528632  23.2037610
> #> 11 4 8708 103684 2  1.029253336  13.8891155
> #> 12 5 8622  83982 2  0.031490843  -7.7520572
> #> 13 6 8593  75944 2 -0.304963951 -16.5811987
> #> 14 7 8600  97036 2 -0.223750725   6.5867850
> #> 15 1 8650 104911 3 -0.682347538   4.5366096
> #> 16 2 8730 114098 3  0.236197225  13.6908244
> #> 17 3 8731  99421 3  0.247679034  -0.9337985
> #> 18 4 8715  85707 3  0.063970082 -14.5988581
> #> 19 5 8717  81273 3  0.086933701 -19.0170347
> #> 20 6 8739 106462 3  0.339533510   6.0820746
> #> 21 7 8684 110635 3 -0.291966014  10.2401827
> #> 22 1 8713 105214 4 -0.534907614  11.6017662
> #> 23 2 8771  92456 4  0.127203640  -1.9307991
> #> 24 3 8759 109270 4 -0.009784895  15.9040146
> #> 25 4 8762  99150 4  0.024462238   5.1696079
> #> 26 5 8730  77306 4 -0.340840523 -18.0005879
> #> 27 6 8780  86324 4  0.229945042  -8.4350859
> #> 28 7 8804  90214 4  0.503922112  -4.3089157
> #> 29 1 8797  99894 5 -0.500896767  -1.7465519
> #> 30 2 8863  95177 5  0.245600995  -6.3860849
> #> 31 3 8873  95910 5  0.358706717  -5.6651229
> #> 32 4 8827 108511 5 -0.161579602   6.7289318
> #> 33 5 8806 115636 5 -0.399101617  13.7369184
> #> 34 6 8869  85542 5  0.313464428 -15.8628500
> #> 35 7 8854 111018 5  0.143805846   9.1947595
> #> 36 1 8571  93247 6  0.088415855 -11.0088128
> #> 37 2 8533  85105 6 -0.355331643 -18.7792102
> #> 38 3 8553 114725 6 -0.121780328   9.4889267
> #> 39 4 8561 122195 6 -0.028359802  16.6179943
> #> 40 5 8532 100945 6 -0.367009209  -3.6621512
> #> 41 6 8560 108552 6 -0.040037368   3.5976637
> #> 42 7 8634 108707 6  0.824102496   3.7455895
> #> 43 1 8646 117420 7 -0.816125860  14.4890796
> #> 44 2 8633 113823 7 -0.965257293  10.9818643
> #> 45 3 8680  82763 7 -0.426089807 -19.3028471
> #> 46 4 8765 121072 7  0.549000328  18.0499220
> #> 47 5 8756  89835 7  0.445755490 -12.4073713
> #> 48 6 8750 104578 7  0.376925598   1.9676287
> #> 49 7 8790  88429 7  0.835791544 -13.7782761
>
> #' Created on 2018-11-27 by the [reprex package](http://reprex.tidyverse.org) (v0.2.0).
> ###############################
>
> On Wed, 28 Nov 2018, Ogbos Okike wrote:
>
> > Dear Jim,
> >
> > I don't think my problem is clear the way I put.
> >
> > I have been trying to manually apply the formula to some rows.
> >
> > This is what I have done.
> > I cut and past some rows from 1-7 and save each with a different file as
> > shown below:
> >
> > 1 8590 12516
> > 2 8641 98143
> > 3 8705 98916
> > 4 8750 89911
> > 5 8685 104835
> > 6 8629 121963
> > 7 8676 77655
> >
> >
> > 1 8577 81081
> > 2 8593 83385
> > 3 8642 112164
> > 4 8708 103684
> > 5 8622 83982
> > 6 8593 75944
> > 7 8600 97036
> >
> >
> > 1 8650 104911
> > 2 8730 114098
> > 3 8731 99421
> > 4 8715 85707
> > 5 8717 81273
> > 6 8739 106462
> > 7 8684 110635
> >
> >
> > 1 8713 105214
> > 2 8771 92456
> > 3 8759 109270
> > 4 8762 99150
> > 5 8730 77306
> > 6 8780 86324
> > 7 8804 90214
> >
> >
> > 1 8797 99894
> > 2 8863 95177
> > 3 8873 95910
> > 4 8827 108511
> > 5 8806 115636
> > 6 8869 85542
> > 7 8854 111018
> >
> >
> > 1 8571 93247
> > 2 8533 85105
> > 3 8553 114725
> > 4 8561 122195
> > 5 8532 100945
> > 6 8560 108552
> > 7 8634 108707
> >
> >
> > 1 8646 117420
> > 2 8633 113823
> > 3 8680 82763
> > 4 8765 121072
> > 5 8756 89835
> > 6 8750 104578
> > 7 8790 88429
> >
> > Each of them is then read as:
> > d1<-read.table("dat1",col.names=c("n","CR","WW"))
> > d2<-read.table("dat2",col.names=c("n","CR","WW"))
> > d3<-read.table("dat3",col.names=c("n","CR","WW"))
> > d4<-read.table("dat4",col.names=c("n","CR","WW"))
> > d5<-read.table("dat5",col.names=c("n","CR","WW"))
> > d6<-read.table("dat6",col.names=c("n","CR","WW"))
> > d7<-read.table("dat7",col.names=c("n","CR","WW"))
> >
> > And my formula for percentage change is applied as follows for column 2:
> > a1<-((d1$CR-mean(d1$CR))/mean(d1$CR))*100
> > a2<-((d2$CR-mean(d2$CR))/mean(d2$CR))*100
> > a3<-((d3$CR-mean(d3$CR))/mean(d3$CR))*100
> > a4<-((d4$CR-mean(d4$CR))/mean(d4$CR))*100
> > a5<-((d5$CR-mean(d5$CR))/mean(d5$CR))*100
> > a6<-((d6$CR-mean(d6$CR))/mean(d6$CR))*100
> > a7<-((d7$CR-mean(d7$CR))/mean(d7$CR))*100
> >
> > a1 to a7 give the percentage change in the data.
> >
> > Instead of doing this one block after the other, could you please
> > indicate how I might apply this formula to the whole data frame,
> > preferably with some code?
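
A minimal sketch of how the per-file steps above might be collapsed into one data frame, assuming the divisor is meant to be the mean of the same 7-day block. The names dta, grp, blockmean and CRpct are hypothetical, and only the first two blocks of the sample data are used.

```r
# First two 7-day blocks of the CR column from the sample data in this thread
dta <- data.frame(
  n  = rep(1:7, times = 2),
  CR = c(8590, 8641, 8705, 8750, 8685, 8629, 8676,
         8577, 8593, 8642, 8708, 8622, 8593, 8600)
)

# A new block starts wherever the day counter returns to 1
dta$grp <- cumsum(dta$n == 1)

# ave() returns the block mean repeated on every row of that block
blockmean <- ave(dta$CR, dta$grp)

# Percentage change of each day relative to its own block mean
dta$CRpct <- 100 * (dta$CR - blockmean) / blockmean
dta
```

Because ave() keeps the original row order, the result lines up with the input rows, and the same two lines could be repeated for the WW column.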
> >
> > Thank you again.
> >
> > Best
> > Ogbos
> >
> > On Wed, Nov 28, 2018 at 5:15 AM Ogbos Okike <[hidden email]>
> > wrote:
> >
> >> Dear Jim,
> >>
> >> I wish also to use the means calculated and apply a certain formula on
> >> the same data frame. In particular, I would like to subtract the mean of
> >> each seven-day block from each of its seven days and divide the
> >> outcome by the same mean. If m1 represents the mean of each seven-day
> >> block in column 1, and c1 is the column 1 data, my formula will be of
> >> the form:
> >> aa<-(c1-m1)/m1
> >>
> >> I tried it on the first 7 rows and got what I am looking for:
> >>   -0.0089986156
> >>   -0.0031149054
> >>    0.0042685741
> >>    0.0094600831
> >>    0.0019612367
> >>   -0.0044993078
> >>    0.0009229349
> >>
> >> But doing it manually will take much time.
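
As a quick check of the formula quoted above, the seven values can be reproduced from the first block of the sample data. This is only a sketch for one block; c1, m1 and aa follow the names used in the message, and expected holds the values listed there.

```r
# First 7-day block of the second data column in the sample
c1 <- c(8590, 8641, 8705, 8750, 8685, 8629, 8676)

m1 <- mean(c1)        # 8668
aa <- (c1 - m1) / m1  # fractional deviation from the block mean

# Values reported in the message for the first seven rows
expected <- c(-0.0089986156, -0.0031149054, 0.0042685741, 0.0094600831,
              0.0019612367, -0.0044993078, 0.0009229349)
all.equal(aa, expected, tolerance = 1e-8)
```

Multiplying aa by 100 gives the same quantity as a percentage.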
> >>
> >> Many thanks for going a step further to assist me.
> >>
> >> Warmest regards.
> >> Ogbos
> >>
> >> On Wed, Nov 28, 2018 at 4:31 AM Jim Lemon <[hidden email]> wrote:
> >>
> >>> Hi Ogbos,
> >>> If we assume that you have a 3 column data frame named oodf, how about:
> >>>
> >>> oodf[,4]<-floor((cumsum(oodf[,1])-1)/28)
> >>> col2means<-by(oodf[,2],oodf[,4],mean)
> >>> col3means<-by(oodf[,3],oodf[,4],mean)
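
The grouping index in the code above works because the day labels 1 to 7 sum to 28, so the running total passes a multiple of 28 exactly at each block boundary. A small sketch, assuming every block is a complete 1-7 run (n, grp_cumsum and grp_rep are hypothetical names):

```r
# Three complete 7-day blocks of day labels
n <- rep(1:7, times = 3)

# Index from the cumsum approach: 0 for the first block, 1 for the second, ...
grp_cumsum <- floor((cumsum(n) - 1) / 28)

# The same index built directly from the block length
grp_rep <- rep(seq_len(length(n) / 7), each = 7) - 1

all(grp_cumsum == grp_rep)
table(grp_cumsum)
```

If a day were missing from any block, the running total would no longer line up with multiples of 28, so the index is only reliable for complete blocks.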
> >>>
> >>> Jim
> >>>
> >>> On Wed, Nov 28, 2018 at 2:06 PM Ogbos Okike <[hidden email]>
> >>> wrote:
> >>>>
> >>>> Dear List,
> >>>> I have three-column data. The data is of the form:
> >>>> 1 8590 12516
> >>>> 2 8641 98143
> >>>> 3 8705 98916
> >>>> 4 8750 89911
> >>>> 5 8685 104835
> >>>> 6 8629 121963
> >>>> 7 8676 77655
> >>>> 1 8577 81081
> >>>> 2 8593 83385
> >>>> 3 8642 112164
> >>>> 4 8708 103684
> >>>> 5 8622 83982
> >>>> 6 8593 75944
> >>>> 7 8600 97036
> >>>> 1 8650 104911
> >>>> 2 8730 114098
> >>>> 3 8731 99421
> >>>> 4 8715 85707
> >>>> 5 8717 81273
> >>>> 6 8739 106462
> >>>> 7 8684 110635
> >>>> 1 8713 105214
> >>>> 2 8771 92456
> >>>> 3 8759 109270
> >>>> 4 8762 99150
> >>>> 5 8730 77306
> >>>> 6 8780 86324
> >>>> 7 8804 90214
> >>>> 1 8797 99894
> >>>> 2 8863 95177
> >>>> 3 8873 95910
> >>>> 4 8827 108511
> >>>> 5 8806 115636
> >>>> 6 8869 85542
> >>>> 7 8854 111018
> >>>> 1 8571 93247
> >>>> 2 8533 85105
> >>>> 3 8553 114725
> >>>> 4 8561 122195
> >>>> 5 8532 100945
> >>>> 6 8560 108552
> >>>> 7 8634 108707
> >>>> 1 8646 117420
> >>>> 2 8633 113823
> >>>> 3 8680 82763
> >>>> 4 8765 121072
> >>>> 5 8756 89835
> >>>> 6 8750 104578
> >>>> 7 8790 88429
> >>>>
> >>>> I wish to calculate the average of the second and third columns for each
> >>>> repeated 7-day block identified by the first column. The length of the
> >>>> data is 1442, that is, 206 blocks of 7. So I should arrive at 206 data
> >>>> points for each of the two columns after calculating the mean of each
> >>>> group 1-7.
> >>>>
> >>>> I have tried both factor/tapply and aggregate functions but seem not to
> >>>> be making progress.
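
One way the aggregate() attempt might be made to work is to build an explicit block index first and then average both columns by it. This is a sketch on the first two blocks of the sample data; the column names n, CR, WW follow the read.table calls elsewhere in the thread, while dta, grp and blockmeans are hypothetical.

```r
# First two 7-day blocks of the sample data
dta <- data.frame(
  n  = rep(1:7, times = 2),
  CR = c(8590, 8641, 8705, 8750, 8685, 8629, 8676,
         8577, 8593, 8642, 8708, 8622, 8593, 8600),
  WW = c(12516, 98143, 98916, 89911, 104835, 121963, 77655,
         81081, 83385, 112164, 103684, 83982, 75944, 97036)
)

# Block index: increases by one each time the day counter restarts at 1
dta$grp <- cumsum(dta$n == 1)

# One row per block, holding the mean of CR and of WW
blockmeans <- aggregate(cbind(CR, WW) ~ grp, data = dta, FUN = mean)
blockmeans
```

Grouping by the first column itself would instead average all day-1 rows together, all day-2 rows together, and so on, which yields 7 rows rather than one per block.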
> >>>>
> >>>> Thank you very much for your idea.
> >>>>
> >>>> Best wishes
> >>>> Ogbos
> >>>>
> >>>
> >>
> >
> >
>
> ---------------------------------------------------------------------------
> Jeff Newmiller                        The     .....       .....  Go Live...
> DCN:<[hidden email]>        Basics: ##.#.       ##.#.  Live Go...
>                                        Live:   OO#.. Dead: OO#..  Playing
> Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
> /Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
> ---------------------------------------------------------------------------
