Sequential date by group

classic Classic list List threaded Threaded
4 messages Options
Reply | Threaded
Open this post in threaded view
|

Sequential date by group

Miluji Sb
Dear all,

I have a panel data with a large number of groups and the cumulative number
of months (1 - 372) for January 1995 to December 2005. My goal is to
extract the corresponding month and year for each observation.

I tried the following;

###
x %>%
  group_by(id) %>%
  do( data.frame(., Date= seq(.$startdate,
                              as.Date('1975-01-01'), by = '1 month')))

However, I get the following error;

###
Error in seq.default(.$startdate, as.Date("1975-01-01"), by = "1 month") :
  'from' must be of length 1

Essentially, I want to convert the month number (1-372) to January 1975 to
December 2005 by ID. Is that possible? Any help will be highly appreciated.

### Data ###
x <- structure(list(id = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L), month =
c(1L,
2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L), startdate = structure(c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "1975-01-01", class = "factor"),
    enddate = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label =
"2005-12-31", class = "factor")), class = "data.frame", row.names = c(NA,
-9L))

Cross-posted in Statalist - however, Stata seems to have a very inflexible
date-time structure.

Best regards,

Milu

        [[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: Sequential date by group

Eric Berger
dtV <- seq(from=as.Date("1975-01-01"),by='1 month',length=372)
x$dt <- dtV[x$id]

HTH,
Eric


On Mon, Sep 2, 2019 at 8:36 PM Miluji Sb <[hidden email]> wrote:

> Dear all,
>
> I have a panel data with a large number of groups and the cumulative number
> of months (1 - 372) for January 1995 to December 2005. My goal is to
> extract the corresponding month and year for each observation.
>
> I tried the following;
>
> ###
> x %>%
>   group_by(id) %>%
>   do( data.frame(., Date= seq(.$startdate,
>                               as.Date('1975-01-01'), by = '1 month')))
>
> However, I get the following error;
>
> ###
> Error in seq.default(.$startdate, as.Date("1975-01-01"), by = "1 month") :
>   'from' must be of length 1
>
> Essentially, I want to convert the month number (1-372) to January 1975 to
> December 2005 by ID. Is that possible? Any help will be highly appreciated.
>
> ### Data ###
> x <- structure(list(id = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L), month =
> c(1L,
> 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L), startdate = structure(c(1L,
> 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "1975-01-01", class = "factor"),
>     enddate = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label =
> "2005-12-31", class = "factor")), class = "data.frame", row.names = c(NA,
> -9L))
>
> Cross-posted in Statalist - however, Stata seems to have a very inflexible
> date-time structure.
>
> Best regards,
>
> Milu
>
>         [[alternative HTML version deleted]]
>
> ______________________________________________
> [hidden email] mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

        [[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: Sequential date by group

Jeff Newmiller
In reply to this post by Miluji Sb
Your sample data has startdate and enddate columns as though these values
might vary throughout the data set, but your description suggests that
those values are fixed for the whole dataset.

If those values are really constant, then

x$Year <- 1995L + 12 * ( ( x$month - 1L ) %/% 12L )
x$Month <- ( x$month - 1L ) %% 12L + 1

should be enough.

If in fact your startdate can be different for different IDs, then you
would need to wrap this up a bit:

############ base R
x$startdate <- as.Date( as.character( x$startdate ) )
x$Y <- as.integer( format( DF$startdate, format = "%Y" ) )
xlist <- split( x, x$id )
xlist1 <- lapply( xlist
                 , FUN = function( DF ) {
                     DF$Year <- Y + 12 * ( ( DF$month - 1L ) %/% 12L )
                     DF$Month <- ( DF$month - 1L ) %% 12L + 1
                     DF
                   }
                 )
x1 <- unsplit( xlist1, x$id )

###### dplyr
library(dplyr)
x2 <- (   x
       %>% mutate( Y = as.integer( format( startdate, format = "%Y" ) ) )
       %>% group_by( id )
       %>% mutate( Year = Y + 12L * ( ( month - 1L ) %/% 12L )
                 , Month = ( month - 1L ) %% 12L + 1L
                 )
       %>% ungroup
       )
#########

On Mon, 2 Sep 2019, Miluji Sb wrote:

> Dear all,
>
> I have a panel data with a large number of groups and the cumulative number
> of months (1 - 372) for January 1995 to December 2005. My goal is to
> extract the corresponding month and year for each observation.
>
> I tried the following;
>
> ###
> x %>%
>  group_by(id) %>%
>  do( data.frame(., Date= seq(.$startdate,
>                              as.Date('1975-01-01'), by = '1 month')))
>
> However, I get the following error;
>
> ###
> Error in seq.default(.$startdate, as.Date("1975-01-01"), by = "1 month") :
>  'from' must be of length 1
>
> Essentially, I want to convert the month number (1-372) to January 1975 to
> December 2005 by ID. Is that possible? Any help will be highly appreciated.
>
> ### Data ###
> x <- structure(list(id = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L), month =
> c(1L,
> 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L), startdate = structure(c(1L,
> 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "1975-01-01", class = "factor"),
>    enddate = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label =
> "2005-12-31", class = "factor")), class = "data.frame", row.names = c(NA,
> -9L))
>
> Cross-posted in Statalist - however, Stata seems to have a very inflexible
> date-time structure.
>
> Best regards,
>
> Milu
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> [hidden email] mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<[hidden email]>        Basics: ##.#.       ##.#.  Live Go...
                                       Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: Sequential date by group

Miluji Sb
Thanks, Jeff. I am trying your solutions - appreciate your help!

Best,

Milu

On Tue, Sep 3, 2019 at 12:51 AM Jeff Newmiller <[hidden email]>
wrote:

> Your sample data has startdate and enddate columns as though these values
> might vary throughout the data set, but your description suggests that
> those values are fixed for the whole dataset.
>
> If those values are really constant, then
>
> x$Year <- 1995L + 12 * ( ( x$month - 1L ) %/% 12L )
> x$Month <- ( x$month - 1L ) %% 12L + 1
>
> should be enough.
>
> If in fact your startdate can be different for different IDs, then you
> would need to wrap this up a bit:
>
> ############ base R
> x$startdate <- as.Date( as.character( x$startdate ) )
> x$Y <- as.integer( format( DF$startdate, format = "%Y" ) )
> xlist <- split( x, x$id )
> xlist1 <- lapply( xlist
>                  , FUN = function( DF ) {
>                      DF$Year <- Y + 12 * ( ( DF$month - 1L ) %/% 12L )
>                      DF$Month <- ( DF$month - 1L ) %% 12L + 1
>                      DF
>                    }
>                  )
> x1 <- unsplit( xlist1, x$id )
>
> ###### dplyr
> library(dplyr)
> x2 <- (   x
>        %>% mutate( Y = as.integer( format( startdate, format = "%Y" ) ) )
>        %>% group_by( id )
>        %>% mutate( Year = Y + 12L * ( ( month - 1L ) %/% 12L )
>                  , Month = ( month - 1L ) %% 12L + 1L
>                  )
>        %>% ungroup
>        )
> #########
>
> On Mon, 2 Sep 2019, Miluji Sb wrote:
>
> > Dear all,
> >
> > I have a panel data with a large number of groups and the cumulative
> number
> > of months (1 - 372) for January 1995 to December 2005. My goal is to
> > extract the corresponding month and year for each observation.
> >
> > I tried the following;
> >
> > ###
> > x %>%
> >  group_by(id) %>%
> >  do( data.frame(., Date= seq(.$startdate,
> >                              as.Date('1975-01-01'), by = '1 month')))
> >
> > However, I get the following error;
> >
> > ###
> > Error in seq.default(.$startdate, as.Date("1975-01-01"), by = "1 month")
> :
> >  'from' must be of length 1
> >
> > Essentially, I want to convert the month number (1-372) to January 1975
> to
> > December 2005 by ID. Is that possible? Any help will be highly
> appreciated.
> >
> > ### Data ###
> > x <- structure(list(id = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L), month =
> > c(1L,
> > 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L), startdate = structure(c(1L,
> > 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "1975-01-01", class =
> "factor"),
> >    enddate = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label =
> > "2005-12-31", class = "factor")), class = "data.frame", row.names = c(NA,
> > -9L))
> >
> > Cross-posted in Statalist - however, Stata seems to have a very
> inflexible
> > date-time structure.
> >
> > Best regards,
> >
> > Milu
> >
> >       [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > [hidden email] mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
>
> ---------------------------------------------------------------------------
> Jeff Newmiller                        The     .....       .....  Go Live...
> DCN:<[hidden email]>        Basics: ##.#.       ##.#.  Live
> Go...
>                                        Live:   OO#.. Dead: OO#..  Playing
> Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
> /Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
> ---------------------------------------------------------------------------
>

        [[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.