Arrange data


Arrange data

Md. Moyazzem Hossain
Hi,

I have a dataset of monthly observations (January to December) covering
several years (2000 to 2018). I am trying to compute the average of Value
from January to July for each year.

The data look like this:

Year    Month   Value
2000    1       25
2000    2       28
2000    3       22
...     ...     ...
2000    12      26
2001    1       27
...     ...     ...
2018    11      30
2018    12      29

Can someone help me in this regard?

Many thanks in advance.

Regards,
Md


______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: Arrange data

Jim Lemon-4
Hi Md,
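One way is to form a subset of your data, then calculate the means by year:

# assume your data frame is named mddat, with columns Year, Month and Value
# keep January to July (Month <= 7), as asked in the question
mddat2 <- mddat[mddat$Month <= 7, ]
# mean of Value within each Year
jan2jul <- by(mddat2$Value, mddat2$Year, mean)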
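
A minimal self-contained sketch of the same subset-then-average idea, for anyone who wants to try it without the original file. The data frame `toy` and its values are made up for illustration, and tapply() is used instead of by() only because it returns a plain named vector that is easy to inspect:

```r
# Made-up monthly data for two years (illustrative values only)
toy <- data.frame(
  Year  = rep(2000:2001, each = 12),
  Month = rep(1:12, times = 2),
  Value = c(1:12, 13:24)
)

# Keep January to July, then average Value within each Year
toy_jan_jul <- toy[toy$Month <= 7, ]
jan_jul_means <- tapply(toy_jan_jul$Value, toy_jan_jul$Year, mean)
jan_jul_means
```

The result has one element per year, named by the year, so a single year can be looked up with jan_jul_means["2000"].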

Jim

On Mon, Aug 3, 2020 at 8:52 PM Md. Moyazzem Hossain <[hidden email]> wrote:


Re: Arrange data

Rasmus Liland-3
On 2020-08-03 21:11 +1000, Jim Lemon wrote:

> Hi Md,
> One way is to form a subset of your
> data, then calculate the means by
> year:
>
> # assume your data is named mddat
> mddat2 <- mddat[mddat$Month <= 7, ]
> jan2jul <- by(mddat2$Value, mddat2$Year, mean)
>
> Jim

Hi Md,

you can also define the period in a new
column, and use aggregate like this:

        Md <- structure(list(
          Year  = c(2000L, 2000L, 2000L, 2000L, 2001L, 2018L, 2018L),
          Month = c(1L, 2L, 3L, 12L, 1L, 11L, 12L),
          Value = c(25L, 28L, 22L, 26L, 27L, 30L, 29L)),
          class = "data.frame",
          row.names = c(NA, -7L))

        # label each row by half-year
        Md[Md$Month %in% 1:6, "Period"] <- "first six months of the year"
        Md[Md$Month %in% 7:12, "Period"] <- "last six months of the year"

        # the formula is passed positionally, which works regardless of
        # how aggregate() names its first argument in your R version
        aggregate(Value ~ Year + Period, data = Md, FUN = mean)
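
The same new-column idea can be adapted to the January to July split asked about in the original question. This is only a sketch: the data frame `toy`, its values and the labels "Jan-Jul" and "Aug-Dec" are assumptions made up for illustration.

```r
# One full year of made-up values (illustrative only)
toy <- data.frame(
  Year  = 2000L,
  Month = 1:12,
  Value = c(25, 28, 22, 30, 24, 27, 26, 21, 29, 23, 20, 26)
)

# Label each month by the period it belongs to
toy$Period <- ifelse(toy$Month <= 7, "Jan-Jul", "Aug-Dec")

# One mean per Year and Period
period_means <- aggregate(Value ~ Year + Period, data = toy, FUN = mean)
period_means
```

Keeping both periods in one table makes it easy to compare them; filter on Period == "Jan-Jul" if only that part is needed.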

Rasmus


Re: Arrange data

Rui Barradas
Hello,

And here is another way, with aggregate.

Make up test data.

set.seed(2020)
df1 <- expand.grid(Year = 2000:2018, Month = 1:12)
df1 <- df1[order(df1$Year),]
df1$Value <- sample(20:30, nrow(df1), TRUE)
head(df1)


#Use subset to keep only the relevant months
aggregate(Value ~ Year, data = subset(df1, Month <= 7), FUN = mean)
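
As a quick sanity check, it can help to report how many months went into each mean. The sketch below assumes a small made-up data frame `toy` (not the poster's data) and lets FUN return both the mean and the count:

```r
# Made-up test data for three years (illustrative only)
set.seed(1)
toy <- expand.grid(Year = 2000:2002, Month = 1:12)
toy$Value <- sample(20:30, nrow(toy), replace = TRUE)

# Keep January to July, then return mean and number of months per year
jan_jul <- subset(toy, Month <= 7)
res <- aggregate(Value ~ Year, data = jan_jul,
                 FUN = function(v) c(mean = mean(v), n = length(v)))
res
```

When FUN returns a vector, the Value column of the result is a matrix, so the counts are in res$Value[, "n"]. A count other than 7 would point to missing or duplicated months.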


Hope this helps,

Rui Barradas

At 12:33 on 03/08/2020, Rasmus Liland wrote:


Re: Arrange data

Jim Lemon-4
In reply to this post by Rasmus Liland-3
Your problem is in the subset operation. You have asked for rows where
Month is greater than or equal to 7 and, at the same time, less than or
equal to 6. No number satisfies both conditions, so the test is FALSE for
every row and mddat2 ends up with no rows to average. (With the column
name in lower case the result is empty altogether, because mddat has no
column called month:)

> mddat$month >= 7 & mddat$month <= 6
logical(0)

In other words, the two logical conditions, when ANDed, can never be TRUE
together. You need to tie each month range to its own year and combine
the two parts with OR. What you want is:

mddat2<-mddat[mddat$Year == 1975 & mddat$Month >= 7 |
 mddat$Year == 1976 & mddat$Month <= 6,]
mean(mddat2$Value)
[1] 88.91667
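
To do this for every year at once, one option is to label each row with the year in which its July to June span starts and then aggregate on that label. This is a sketch under assumptions: `toy` is made-up data and `StartYear` is a hypothetical helper column, not something in the attached file.

```r
# Made-up data: January 1975 to December 1976 (illustrative only)
toy <- data.frame(
  Year  = rep(1975:1976, each = 12),
  Month = rep(1:12, times = 2),
  Value = 1:24
)

# The original condition can never hold for any row
sum(toy$Month >= 7 & toy$Month <= 6)  # 0: no row satisfies both

# July to December belong to the span starting this year,
# January to June to the span that started the year before
toy$StartYear <- ifelse(toy$Month >= 7, toy$Year, toy$Year - 1)

span_means <- aggregate(Value ~ StartYear, data = toy, FUN = mean)
span_means
```

Note that the first and last groups are incomplete here (only six months each), so they would normally be dropped before reporting.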

Apart from that, your email client is inserting EOL characters that
cause an error when pasted into R.

Error: unexpected input in "�"

This is probably due to MS Outlook; it has been happening quite a bit lately.

Jim

On Mon, Aug 3, 2020 at 11:30 PM Md. Moyazzem Hossain
<[hidden email]> wrote:

>
> Dear Jim,
>
> Thank you very much. It is working now.
>
> However, I am also trying to find the average of the values from July 1975 to June 1976 and record it as the value for the year 1975, but I got an error message. I am attaching the data file here. Please check the attachment.
>
> mddat=read.csv("F:/mddat.csv", header=TRUE)
> mddat2<-mddat[mddat$Month >=7 & mddat$Month <= 6,]
> jan2jun<-by(mddat2$Value,mddat2$Year,mean)
> jan2jun
>
> Please help me again and many thanks in advance.
>
> Md

Re: Arrange data

Rui Barradas
In reply to this post by Rui Barradas
Hello,

Please keep cc-ing the list. R-help is threaded, and questions and answers
might be of help to others in the future.

As for the question, see if the following code does what you want.
First, create a logical index i of the months from July (7) to December
(12) or January (1) to March (3), and use that index to subset the
original data.frame. Then a cumsum trick gives a vector M defining the
data grouping: wherever the row numbers of the subset jump by more than 1,
a new group starts. Group and compute the Value means with aggregate.
Finally, since each group spans a year border, create a more meaningful
Years column and put everything together.

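df1 <- read.csv("mddat.csv")

# rows for July to December or January to March
i <- with(df1, (Month >= 7 & Month <= 12) | (Month >= 1 & Month <= 3))
df2 <- df1[i, ]
# a gap in the original row numbers marks the start of a new group
# (this relies on df1 having default row names and being sorted by date)
M <- cumsum(c(FALSE, diff(as.integer(row.names(df2))) > 1))

agg <- aggregate(Value ~ M, df2, mean)
# label each group with the first and last year it covers
Years <- sapply(split(df2$Year, M), function(x) {
  paste(x[1], x[length(x)], sep = "-")
})
final <- cbind.data.frame(Years, Value = agg[["Value"]])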

head(final)
#      Years    Value
#0 1975-1975 87.00000
#1 1975-1976 89.44444
#2 1976-1977 85.77778
#3 1977-1978 81.55556
#4 1978-1979 71.55556
#5 1979-1980 75.77778


Hope this helps,

Rui Barradas



At 20:44 on 04/08/20, Md. Moyazzem Hossain wrote:

> Dear Rui,
>
> Thanks a lot for your help.
>
> It is working. Now I am also trying to find the average of values for
> *July 1975 to March 1976* and record as the value of the year 1975.
> Moreover, I want to continue it up to the year 2017. You may check the
> attached file for data (mddat.csv).
>
> I used the following call but got an error:
> aggregate(Value ~ Year, data = subset(df1, Month >= 7 & Month <= 3), FUN = mean)
>
> Please help me again. Thanks in advance.
>
> Best Regards,
> Md

Re: Arrange data

Jim Lemon-4
In reply to this post by Jim Lemon-4
Hi Md,
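I think the errors are that you forgot to initialize "m" and calculated
the mean outside the loop that builds the subset. You also need only one
loop: the inner loop over j repeats the same subset 44 times, and
mddat2[j] tries to store a whole data frame in a single column. Note that
1975:2017 is 43 years, so m needs 43 elements:

years <- 1975:2017
m <- rep(0, length(years))
for(j in seq_along(years)) {
  i <- years[j]
  # July of year i to June of year i + 1
  mddat2 <- mddat[(mddat$Year == i & mddat$Month >= 7) |
                  (mddat$Year == i + 1 & mddat$Month <= 6), ]
  m[j] <- mean(mddat2$Value)
}
m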
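
The loop can also be written with sapply(), which avoids pre-allocating the result. This is a sketch on made-up data; `toy`, `start_years` and the helper `span_mean()` are hypothetical names used only for illustration.

```r
# Made-up data: January 1975 to December 1977 (illustrative only)
toy <- data.frame(
  Year  = rep(1975:1977, each = 12),
  Month = rep(1:12, times = 3),
  Value = 1:36
)

# Mean of Value from July of year y to June of year y + 1
span_mean <- function(y, dat) {
  rows <- (dat$Year == y & dat$Month >= 7) |
          (dat$Year == y + 1 & dat$Month <= 6)
  mean(dat$Value[rows])
}

start_years <- 1975:1976
m <- sapply(start_years, span_mean, dat = toy)
names(m) <- start_years
m
```

Naming the result by start year makes it clear which span each mean belongs to.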

Jim

On Wed, Aug 5, 2020 at 6:04 AM Md. Moyazzem Hossain <[hidden email]> wrote:

>
> Dear Jim,
>
> Thank you very much. You are right. It is good now. However, I want to continue it up to the year 2017.
>
> I used the following code but got an error:
>
> for(i in 1975:2017){
>   for(j in 1:44){
> mddat2[j]<-mddat[mddat$Year == i & mddat$Month >= 7 |
>                 mddat$Year == (i+1) & mddat$Month <= 6,]
> }
> m[j]=mean(mddat2$Value)
>
> }
> m
>
> Please help me in this regard. Many thanks in advance.
>
> Regards,
> Md

Re: Arrange data

Md. Moyazzem Hossain
In reply to this post by Rui Barradas
Dear Rui,

Thank you for your nice help.

Take care and be safe.

Md

On Tue, Aug 4, 2020 at 10:45 PM Rui Barradas <[hidden email]> wrote:



--
Best Regards,
Md. Moyazzem Hossain
Associate Professor
Department of Statistics
Jahangirnagar University
Savar, Dhaka-1342
Bangladesh
Website: http://www.juniv.edu/teachers/hossainmm
Research: Google Scholar <https://scholar.google.com/citations?user=-U03XCgAAAAJ&hl=en&oi=ao>;
ResearchGate <https://www.researchgate.net/profile/Md_Hossain107>;
ORCID iD <https://orcid.org/0000-0003-3593-6936>

Re: Arrange data

Md. Moyazzem Hossain
In reply to this post by Jim Lemon-4
Dear Jim,

Thanks a lot for your support.

Take care.

Md

On Wed, Aug 5, 2020 at 1:06 PM Jim Lemon <[hidden email]> wrote:
