Alternative and more efficient data manipulation

Sam Albers
Hello list,

I have been using the following process to convert data from one form
to another for a while, but it occurs to me that there is probably an
easier way to do this. I am often given data whose column names are
actually data, and I much prefer dealing with data that are organized
by factors. To convert the columns I have previously used
make.groups() in the lattice package, which works satisfactorily.
However, it is a bit clunky for this purpose, and I have to carry the
other variables forward by hand. Can anyone suggest a better way of
converting data like this?

library(lattice)

dat <- data.frame(`x1`=runif(6, 0, 125),
                  `x2`=runif(6, 50, 75),
                  `x3`=runif(6, 0, 100),
                  `x4`=runif(6, 0, 200),
                  date = as.Date(c("2009-09-25", "2009-09-28", "2009-10-02",
                                   "2009-10-07", "2009-10-15", "2009-10-21")),
                  yy = head(letters, 2), check.names = FALSE)
## Here is an example of the type of data that NEED converting
dat

dat.group <- with(dat, make.groups(x1,x2,x3,x4))
## Carrying the other variables forward
dat.group$date <- dat$date
dat.group$yy <- dat$yy
## Here is an example of what I would like the data to look like
dat.group

## The point of all this is so that I can use the data in a manner
## such as this:
with(dat.group, xyplot(data ~ as.numeric(substr(which, 2,2))|yy, groups=date))

## So I suppose what I am asking is whether there is a more efficient
## way of doing this?

Thanks so much in advance!

Sam

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: Alternative and more efficient data manipulation

Mikhail Titov-2
?reshape

You have your data in a wide format, but you want it in a long format.
reshape can convert it both ways.
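
For illustration, here is a minimal sketch of how base reshape() might be applied to the example data from the original post. The names "value" and "position" for the new columns are assumptions chosen for this example, and using date as the id variable relies on the dates being unique in this data set.

```r
# Example data in wide format, as in the original post
dat <- data.frame(x1 = runif(6, 0, 125),
                  x2 = runif(6, 50, 75),
                  x3 = runif(6, 0, 100),
                  x4 = runif(6, 0, 200),
                  date = as.Date(c("2009-09-25", "2009-09-28", "2009-10-02",
                                   "2009-10-07", "2009-10-15", "2009-10-21")),
                  yy = head(letters, 2))

# Wide to long: x1..x4 are stacked into a single column, and the
# non-varying columns (date, yy) are carried along automatically
long <- reshape(dat,
                direction = "long",
                varying   = c("x1", "x2", "x3", "x4"),
                v.names   = "value",     # assumed name for the stacked values
                timevar   = "position",  # assumed name for the column index
                times     = 1:4,
                idvar     = "date")      # dates are unique in this example
rownames(long) <- NULL

head(long)
```

Because times is numeric here, the position column can be used on the x-axis directly, without parsing the digit out of the old column names.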

Mikhail


> -----Original Message-----
> From: [hidden email] [mailto:[hidden email]] On Behalf Of Sam Albers
> Sent: Monday, August 15, 2011 6:58 PM
> To: [hidden email]
> Subject: [R] Alternative and more efficient data manipulation

Re: Alternative and more efficient data manipulation

djmuseR
Hi:

Following on from the reshape() suggestion, the reshape package (in
particular, melt()) is an alternative to make.groups(); both serve the
same purpose in this example:

dat <- data.frame(`x1`=runif(6, 0, 125),
                 `x2`=runif(6, 50, 75),
                 `x3`=runif(6, 0, 100),
                 `x4`=runif(6, 0, 200),
                 date = as.Date(c("2009-09-25","2009-09-28","2009-10-02",
                                  "2009-10-07","2009-10-15","2009-10-21")),
                 yy= head(letters,2), check.names=FALSE)

# id variables are not melted, but carried along in parallel
library('reshape')
mdat <- melt(dat, id = c('date', 'yy'))
# Create a new variable in the data frame rather than try to
# create it in the plot call
# (within() takes an expression, so assign with <- rather than =)
mdat <- within(mdat, val <- as.numeric(substr(as.character(variable), 2, 2)))
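
One caveat on the step above: substr(x, 2, 2) reads a single character, so it would misread a column name with a two-digit suffix. A small sketch of a more general extraction, assuming the names always consist of the prefix "x" followed by digits (the "x10" entry is hypothetical and does not occur in the example data):

```r
# Stand-in for the 'variable' factor produced by melt();
# "x10" is a hypothetical name added to show the multi-digit case
variable <- factor(c("x1", "x2", "x10"))

# Strip the leading "x" and convert whatever remains to numeric
val <- as.numeric(sub("^x", "", as.character(variable)))
val
```

With only x1 to x4 present, as in this thread, both approaches give the same values.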

# Here are a couple of plot examples that appear to be close
# to what you want to do:

library('lattice')
library('ggplot2')

# Create a key for dates
mkey <- list(space = 'top', columns = 2,
             title = 'Date', cex.title = 1,
             text = list(as.character(unique(mdat$date))),
             points = list(pch = 16, col = 1:6),
             lines = list(lty = 1, col = 1:6))
xyplot(value ~ val | yy, data = mdat, groups = date,
       pch = 16, col = 1:6, col.line = 1:6,
       type = c('p', 'l'), key = mkey)

# Similar type of plot in ggplot2 (legend at right instead)
ggplot(mdat, aes(x = val, y = value, colour = factor(date))) +
     geom_point(size = 2.5) + geom_line(aes(group = date), size = 1) +
     facet_wrap( ~ yy, ncol = 2) +
     scale_colour_brewer('Date', palette = 'Dark2')

HTH,
Dennis