Arrange two columns into a five variable dataframe


David Arnold
Hi,

I hope that folks can give me some simple approaches for taking the data set below, which is stored in two columns called "long" and "group", and arranging the data in the "long" column into a data frame containing five variables: "Group 1", "Group 2", "Group 3", "Group 4", and "Group 5".  I am hoping for a few different techniques which I can pass on to my students.

Thanks

David Arnold
College of the Redwoods


> dput(flies)
structure(list(long = c(40L, 37L, 44L, 47L, 47L, 47L, 68L, 47L,
54L, 61L, 71L, 75L, 89L, 58L, 59L, 62L, 79L, 96L, 58L, 62L, 70L,
72L, 74L, 96L, 75L, 46L, 42L, 65L, 46L, 58L, 42L, 48L, 58L, 50L,
80L, 63L, 65L, 70L, 70L, 72L, 97L, 46L, 56L, 70L, 70L, 72L, 76L,
90L, 76L, 92L, 21L, 40L, 44L, 54L, 36L, 40L, 56L, 60L, 48L, 53L,
60L, 60L, 65L, 68L, 60L, 81L, 81L, 48L, 48L, 56L, 68L, 75L, 81L,
48L, 68L, 35L, 37L, 49L, 46L, 63L, 39L, 46L, 56L, 63L, 65L, 56L,
65L, 70L, 63L, 65L, 70L, 77L, 81L, 86L, 70L, 70L, 77L, 77L, 81L,
77L, 16L, 19L, 19L, 32L, 33L, 33L, 30L, 42L, 42L, 33L, 26L, 30L,
40L, 54L, 34L, 34L, 47L, 47L, 42L, 47L, 54L, 54L, 56L, 60L, 44L
), group = structure(c(5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 4L,
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L), .Label = c("Group 5", "Group 4", "Group 3", "Group 2",
"Group 1"), class = "factor")), .Names = c("long", "group"), row.names = c(NA,
-125L), class = "data.frame")
Re: Arrange two columns into a five variable dataframe

Brian Diggs
On 7/13/2012 8:37 PM, darnold wrote:

> Hi,
>
> I hope that folks can give me some simple approaches to taking the data set
> below, which is accumulated in two columns called "long" and "group", then
> arrange the data in the "long" column into a data frame containing five
> variables: "Group 1", "Group 2", "Group 3", "Group 4", and "Group 5".  I am
> hoping for a few different techniques which I can pass on to my students.
>
> Thanks
>
> David Arnold
> College of the Redwoods

Generally I would recommend either the reshape function or the functions
in the reshape2 package. However, your data doesn't quite have what is
needed to use those. You are implicitly assuming that the first
occurring values in each group go together (should be in the same row),
as do the second ones, etc.  The reshape functions require an explicit
identifier indicating which values go together in a row.
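
As a rough sketch of what such an explicit indication could look like, the
code below numbers the observations within each group and hands that index
to base reshape(). The toy data frame and the names toy, id and wide are
made up for illustration and are not part of the original thread.

```r
# Toy data with the same layout as flies (values and names are illustrative)
toy <- data.frame(
  long  = c(40L, 37L, 44L, 46L, 42L, 65L),
  group = factor(rep(c("Group 1", "Group 2"), each = 3),
                 levels = c("Group 2", "Group 1"))
)

# Number the observations within each group: 1, 2, 3, 1, 2, 3
toy$id <- ave(seq_along(toy$group), toy$group, FUN = seq_along)

# With an explicit id, reshape() knows which values share a row
wide <- reshape(toy, idvar = "id", timevar = "group", direction = "wide")
```

Building the index with ave() avoids hard-coding the group size, so it
should also work when the number of observations per group is not known in
advance.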

The unstack function will work for you and uses the same assumption.

 > unstack(flies)
    Group.5 Group.4 Group.3 Group.2 Group.1
1       16      35      21      46      40
2       19      37      40      42      37
3       19      49      44      65      44
4       32      46      54      46      47
5       33      63      36      58      47
6       33      39      40      42      47
7       30      46      56      48      68
8       42      56      60      58      47
9       42      63      48      50      54
10      33      65      53      80      61
11      26      56      60      63      71
12      30      65      60      65      75
13      40      70      65      70      89
14      54      63      68      70      58
15      34      65      60      72      59
16      34      70      81      97      62
17      47      77      81      46      79
18      47      81      48      56      96
19      42      86      48      70      58
20      47      70      56      70      62
21      54      70      68      72      70
22      54      77      75      76      72
23      56      77      81      90      74
24      60      81      48      76      96
25      44      77      68      92      75
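
Note that the columns above come out in factor-level order and with names
like Group.5 rather than "Group 5". A possible way to restore the requested
names and order is sketched below on a small toy data frame (toy and wide
are illustrative names, not from the thread). It assumes every group has the
same number of observations; otherwise unstack() returns a list rather than
a data frame.

```r
# Toy data with the same layout as flies (values and names are illustrative)
toy <- data.frame(
  long  = c(40L, 37L, 44L, 46L, 42L, 65L),
  group = factor(rep(c("Group 1", "Group 2"), each = 3),
                 levels = c("Group 2", "Group 1"))
)

# Spell out the formula: values on the left, grouping factor on the right
wide <- unstack(toy, long ~ group)

# Columns follow the factor levels, so the original labels can be restored
names(wide) <- levels(toy$group)

# Reorder the columns alphabetically: "Group 1", "Group 2"
wide <- wide[sort(names(wide))]
```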



--
Brian S. Diggs, PhD
Senior Research Associate, Department of Surgery
Oregon Health & Science University

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: Arrange two columns into a five variable dataframe

arun kirshna
In reply to this post by David Arnold


Hi,

You could use either one of these methods:
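# Method 1: split by group, then bind the pieces side by side

dat1 <- flies  # dat1 is the data frame from the original post

list1 <- split(dat1, dat1$group)
# split() orders the pieces by factor level, so list1[[5]] is "Group 1"
dat2 <- data.frame(list1[[5]][1], list1[[4]][1], list1[[3]][1], list1[[2]][1], list1[[1]][1])
colnames(dat2) <- rev(levels(dat1$group))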
head(dat2)
  Group 1 Group 2 Group 3 Group 4 Group 5
1      40      46      21      35      16
2      37      42      40      37      19
3      44      65      44      49      19
4      47      46      54      46      32
5      47      58      36      63      33
6      47      42      40      39      33
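
A more compact variant of the same split idea, sketched on a small toy data
frame (toy and wide are illustrative names, not from the thread): split only
the "long" vector and let data.frame() assemble the columns. It assumes all
groups have the same size.

```r
# Toy data with the same layout as flies (values and names are illustrative)
toy <- data.frame(
  long  = c(40L, 37L, 44L, 46L, 42L, 65L),
  group = factor(rep(c("Group 1", "Group 2"), each = 3),
                 levels = c("Group 2", "Group 1"))
)

# Re-level the factor so the pieces come out as "Group 1", "Group 2"
g <- factor(toy$group, levels = sort(levels(toy$group)))

# split() returns a named list of vectors; check.names = FALSE keeps the
# spaces in the column names instead of turning them into dots
wide <- data.frame(split(toy$long, g), check.names = FALSE)
```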


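# Method 2: base R reshape() with an explicit row ID
# (reshape() lives in the stats package, so no extra library is needed)

dat3 <- data.frame(dat1, ID = rep(1:25, 5))  # 25 observations in each of the 5 groups
dat4 <- reshape(dat3, idvar = "ID", timevar = "group", direction = "wide")
dat4 <- dat4[, -1]  # drop the ID column
colnames(dat4) <- rev(levels(dat3$group))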
head(dat4)
   Group 1 Group 2 Group 3 Group 4 Group 5
1      40      46      21      35      16
2      37      42      40      37      19
3      44      65      44      49      19
4      47      46      54      46      32
5      47      58      36      63      33
6      47      42      40      39      33


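# Method 3: melt() and dcast() from the reshape2 package
library(reshape2)

dat3 <- data.frame(dat1, ID = rep(1:25, 5))
dat5 <- dcast(melt(dat3, id.vars = c("ID", "group")), ID ~ variable + group)

dat5 <- dat5[, -1]  # drop the ID column
colnames(dat5) <- levels(dat3$group)  # dcast() orders the columns by factor level
dat5 <- dat5[, 5:1]  # reorder to Group 1 ... Group 5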
head(dat5)

 Group 1 Group 2 Group 3 Group 4 Group 5
1      40      46      21      35      16
2      37      42      40      37      19
3      44      65      44      49      19
4      47      46      54      46      32
5      47      58      36      63      33
6      47      42      40      39      33




> identical(dat2,dat4)
[1] TRUE
> identical(dat2,dat5)
[1] TRUE


A.K.






