Reshape from long to wide format with date variable

Pete Pete
Hi,

I need to reshape my dataframe from a long format to a wide format. Unfortunately, I have a continuous date variable which gives me headaches.

Consider the following example:
> id=c("034","034","016","016","016","340","340")
> date=as.Date(c("1997-09-28", "1997-10-06", "1997-11-04", "2000-09-27", "2003-07-20", "1997-11-08", "1997-11-08"))
> ref=c("2","2","1","1","2","1","1")
> data1=data.frame(id,date,ref)
> data1
   id       date ref
1 034 1997-09-28   2
2 034 1997-10-06   2
3 016 1997-11-04   1
4 016 2000-09-27   1
5 016 2003-07-20   2
6 340 1997-11-08   1
7 340 1997-11-08   1


I would like to have it like this:
> data2
   id      date1      date2      date3 ref1 ref2 ref3
1 034 1997-09-28 1997-10-06         NA    2    2   NA
2 016 1997-11-04 2000-09-27 2003-07-20    1    1    2
3 340 1997-11-08 1997-11-08         NA    1    1   NA

I tried the reshape package but ended up with multiple variables for each of the dates, which is not what I want.

Thanks for your help.

Re: Reshape from long to wide format with date variable

Joshua Wiley-2
Hi Pete,

Try the reshape function (see ?reshape for documentation).  It can be
a bit confusing, but it's worth learning if you deal with multiple
observations per unit much.  The code inline does what you want (though
you might need a bit of tweaking to get pretty names, etc.).

HTH,

Josh

On Wed, Jul 6, 2011 at 6:40 AM, Pete Pete <[hidden email]> wrote:

> Hi,
>
> I need to reshape my dataframe from a long format to a wide format.
> Unfortunately, I have a continuous date variable which gives me headaches.
>
> Consider the following example:
>> id=c("034","034","016","016","016","340","340")
>> date=as.Date(c("1997-09-28", "1997-10-06", "1997-11-04", "2000-09-27",
>> "2003-07-20", "1997-11-08", "1997-11-08"))
>> ref=c("2","2","1","1","2","1","1")
>> data1=data.frame(id,date,ref)

## create time variable: the occasion number within each id (1, 2, 3, ...),
## in the order the rows appear
data1$time <- with(data1, ave(seq_len(nrow(data1)), id, FUN = seq_along))

wdata1 <- reshape(data1, idvar = "id", timevar = "time", direction = "wide")
> wdata1
   id     date.1 ref.1     date.2 ref.2     date.3 ref.3
1 034 1997-09-28     2 1997-10-06     2       <NA>  <NA>
3 016 1997-11-04     1 2000-09-27     1 2003-07-20     2
6 340 1997-11-08     1 1997-11-08     1       <NA>  <NA>
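As for the "pretty names" tweaking, here is a minimal sketch that gets closer to the requested layout. It assumes that passing sep = "" to reshape() is acceptable for building the column names, and that the date columns should come before the ref columns; the variable n is introduced here only for illustration.

```r
id   <- c("034", "034", "016", "016", "016", "340", "340")
date <- as.Date(c("1997-09-28", "1997-10-06", "1997-11-04", "2000-09-27",
                  "2003-07-20", "1997-11-08", "1997-11-08"))
ref  <- c("2", "2", "1", "1", "2", "1", "1")
data1 <- data.frame(id, date, ref, stringsAsFactors = FALSE)

# Occasion counter within each id, in order of appearance
data1$time <- ave(seq_len(nrow(data1)), data1$id, FUN = seq_along)

# sep = "" gives date1, ref1, ... instead of date.1, ref.1, ...
wdata1 <- reshape(data1, idvar = "id", timevar = "time",
                  direction = "wide", sep = "")

# Assumption: all date columns first, then all ref columns
n <- max(data1$time)
wdata1 <- wdata1[, c("id", paste0("date", seq_len(n)),
                     paste0("ref", seq_len(n)))]
rownames(wdata1) <- NULL
```

The date columns should keep their Date class this way, so they can still be compared or subtracted afterwards.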



> --
> View this message in context: http://r.789695.n4.nabble.com/Reshape-from-long-to-wide-format-with-date-variable-tp3648833p3648833.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



--
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
https://joshuawiley.com/


Re: Reshape from long to wide format with date variable

djmuseR
In reply to this post by Pete Pete
Hi:

Here's one way with the reshape package. I converted ref to numeric
and date to character string first. Sometimes these little things
matter...

library(plyr)
library(reshape)

# Modified original data; note the option in the data.frame() statement
id=c("034","034","016","016","016","340","340")
date=c("1997-09-28", "1997-10-06", "1997-11-04", "2000-09-27",
               "2003-07-20", "1997-11-08", "1997-11-08")
ref=c(2, 2, 1, 1, 2, 1, 1)
data1=data.frame(id, date, ref, stringsAsFactors = FALSE)

# Add a new variable named occasion within id
data2 <- ddply(data1, .(id), transform, occasion = seq_along(id))

# Use the cast() function in reshape twice, adjust names in each
c1 <- cast(data2, id ~ occasion, value = 'date')
c2 <- cast(data2, id ~ occasion, value = 'ref')
names(c1)[-1] <- paste('date', 1:3, sep = '')
names(c2)[-1] <- paste('ref', 1:3, sep = '')

# merge c1 and c2 by id:
merge(c1, c2, by = 'id')

The cast() function puts one id in each row and one occasion in each
column; value names the variable whose values fill the cells for each
id * occasion combination.
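If the reshape package is not available, the same id * occasion idea can be sketched in base R. This is only an illustration, not what cast() does internally: the helper to_wide() is hypothetical, and the conversion of the date columns back to Date at the end is an assumption about what is wanted.

```r
data1 <- data.frame(
  id   = c("034", "034", "016", "016", "016", "340", "340"),
  date = c("1997-09-28", "1997-10-06", "1997-11-04", "2000-09-27",
           "2003-07-20", "1997-11-08", "1997-11-08"),
  ref  = c(2, 2, 1, 1, 2, 1, 1),
  stringsAsFactors = FALSE
)

# Occasion number within each id, in order of appearance
data1$occasion <- ave(seq_len(nrow(data1)), data1$id, FUN = seq_along)

# Hypothetical helper: one row per id, one column per occasion,
# cells filled with the values of the chosen variable
to_wide <- function(d, value, prefix) {
  ids <- unique(d$id)
  occ <- seq_len(max(d$occasion))
  m <- matrix(NA, nrow = length(ids), ncol = length(occ),
              dimnames = list(ids, paste0(prefix, occ)))
  # Matrix indexing: one (row, column) pair per observation
  m[cbind(match(d$id, ids), d$occasion)] <- d[[value]]
  data.frame(id = ids, m, stringsAsFactors = FALSE, row.names = NULL)
}

wide <- merge(to_wide(data1, "date", "date"),
              to_wide(data1, "ref", "ref"), by = "id")

# Assumption: turn the character dates back into Date objects
date_cols <- grep("^date", names(wide))
wide[date_cols] <- lapply(wide[date_cols], as.Date)
```

Note that merge() sorts the result by id, so the rows come out as 016, 034, 340 rather than in the original order.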

HTH,
Dennis


Re: Reshape from long to wide format with date variable

Pete Pete
In reply to this post by Joshua Wiley-2
Thanks, Josh!
The index variable (time) was my problem. My R skills are too low! :)
Problem solved!