|
|
I seem to be doing something really stupid or missing something really obvious, but what?
I have a simple three-column data.frame that I would like to reshape to wide format, preferably using reshape2.
An example from http://stackoverflow.com/questions/9617348/reshape-three-column-data-frame-to-matrix looked perfect, except that I wanted a data frame rather than a matrix. I just changed acast to dcast and it seems fine.
Stackoverflow example:
library(reshape2)
tmp <- data.frame(x = gl(2, 3, labels = letters[24:25]),
                  y = gl(3, 1, 6, labels = letters[1:3]),
                  z = c(1, 2, 3, 3, 3, 2))
dd <- dcast(tmp, x ~ y, value.var = "z")
My Example: Does NOT work
md2 <- structure(list(group = structure(c(1L, 1L, 1L, 2L, 2L, 3L, 3L,
4L, 4L, 4L, 5L, 5L, 6L, 6L, 7L, 7L, 7L, 8L, 8L), .Label = c("X1",
"X2", "X3", "X4", "X5", "X6", "X7", "X8"), class = "factor"),
tps = structure(c(7L, 12L, 14L, 4L, 8L, 9L, 16L, 6L, 7L,
11L, 6L, 15L, 10L, 13L, 3L, 4L, 5L, 1L, 2L), .Label = c("A",
"C", "D", "E", "G", "I", "L", "M", "N", "P", "Q", "R", "S",
"T", "V", "Y"), class = "factor"), sum = c(0.914913196595112,
0.0367565080432513, 0.0483302953616366, 0.982727803634948,
0.0172721963650521, 0.0483302953616366, 0.951669704638363,
0.89764100023006, 0.0850868034048879, 0.0172721963650521,
0.951669704638363, 0.0483302953616366, 0.963243491956749,
0.0367565080432513, 0.89764100023006, 0.0540287044083034,
0.0483302953616366, 0.982727803634948, 0.0172721963650521
)), .Names = c("group", "tps", "sum"), row.names = c(NA,
-19L), class = "data.frame")
dcast(md2, group ~ tps , value.vars = "sum")
What am I doing wrong?
John Kane
Kingston ON Canada
______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
|
|
Did you melt first?
library(reshape2)
?melt
> Date: Tue, 24 Jul 2012 09:17:08 -0800
> From: [hidden email]
> To: [hidden email]
> Subject: [R] Simple reshape problem I am completely missing
|
|
Hi,
I think your group and tps variables have a lot of levels, so the combinations that do not occur in the data end up as NA.
Try this:
> # sum is drawn with rnorm(), so your values will differ from those shown
> md4 <- data.frame(group = c(rep("X1", 3), rep("X2", 3)),
+                   tps = c("L", "R", "P", "L", "R", "P"),
+                   sum = rnorm(6, 15))
> md4
  group tps      sum
1    X1   L 13.94542
2    X1   R 14.34785
3    X1   P 15.31574
4    X2   L 14.50404
5    X2   R 13.73331
6    X2   P 14.69673
> dd <- dcast(md4, group ~ tps, value.var = "sum")
> dd
  group        L        P        R
1    X1 13.94542 15.31574 14.34785
2    X2 14.50404 14.69673 13.73331
A.K.
|
|
The data is already in "long" form, so it doesn't require melting. The
problem was simply that you used the wrong argument name: change
"value.vars" to "value.var", i.e.,
dcast(md2, group ~ tps , value.var = "sum")
should work fine.
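To make the fix concrete, here is a minimal sketch of the corrected call. The data frame below rebuilds md2 from the original post with the values rounded to four decimals (an editorial shortcut, not the poster's exact numbers), and the name wide is only illustrative. Since dcast() also accepts extra arguments through ..., a misspelled name such as value.vars is presumably not matched to value.var, which would explain why the original call did not behave as intended.

```r
library(reshape2)

# md2 from the original post, rebuilt compactly.
# ASSUMPTION: values are rounded to four decimals for brevity.
md2 <- data.frame(
  group = paste0("X", c(1, 1, 1, 2, 2, 3, 3, 4, 4, 4,
                        5, 5, 6, 6, 7, 7, 7, 8, 8)),
  tps   = c("L", "R", "T", "E", "M", "N", "Y", "I", "L", "Q",
            "I", "V", "P", "S", "D", "E", "G", "A", "C"),
  sum   = c(0.9149, 0.0368, 0.0483, 0.9827, 0.0173, 0.0483, 0.9517,
            0.8976, 0.0851, 0.0173, 0.9517, 0.0483, 0.9632, 0.0368,
            0.8976, 0.0540, 0.0483, 0.9827, 0.0173)
)

# Note the singular argument name: value.var, not value.vars.
wide <- dcast(md2, group ~ tps, value.var = "sum")

# One row per group, one column per tps level plus the group column.
dim(wide)
```

The result has one row for each of the eight groups; combinations of group and tps that never occur in md2 show up as NA.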
Best,
Ista
On Tue, Jul 24, 2012 at 1:23 PM, jose Bartolomei < [hidden email]> wrote:
>
> Did you melt first?
>
> library(reshape2)
> ?melt
|
|
Hello,
I think John was asking about the NAs in the output. If both factors have many levels and only a few of the possible combinations occur in the data, most cells of the result will be NA.
#For example,
> mdres <- dcast(md2, tps ~ group, value.var = "sum")
> mdres
   tps         X1        X2        X3        X4        X5         X6        X7        X8
1    A         NA        NA        NA        NA        NA         NA        NA 0.9827278
2    C         NA        NA        NA        NA        NA         NA        NA 0.0172722
3    D         NA        NA        NA        NA        NA         NA 0.8976410        NA
4    E         NA 0.9827278        NA        NA        NA         NA 0.0540287        NA
5    G         NA        NA        NA        NA        NA         NA 0.0483303        NA
6    I         NA        NA        NA 0.8976410 0.9516697         NA        NA        NA
7    L 0.91491320        NA        NA 0.0850868        NA         NA        NA        NA
8    M         NA 0.0172722        NA        NA        NA         NA        NA        NA
9    N         NA        NA 0.0483303        NA        NA         NA        NA        NA
10   P         NA        NA        NA        NA        NA 0.96324349        NA        NA
11   Q         NA        NA        NA 0.0172722        NA         NA        NA        NA
12   R 0.03675651        NA        NA        NA        NA         NA        NA        NA
13   S         NA        NA        NA        NA        NA 0.03675651        NA        NA
14   T 0.04833030        NA        NA        NA        NA         NA        NA        NA
15   V         NA        NA        NA        NA 0.0483303         NA        NA        NA
16   Y         NA        NA 0.9516697        NA        NA         NA        NA        NA
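The number of NAs follows directly from the factor levels. Here is a rough base-R check, assuming the same group and tps values as in md2; the names observed, cells, and empty are made up for this sketch.

```r
# group and tps values as in md2 from the original post.
group <- factor(paste0("X", c(1, 1, 1, 2, 2, 3, 3, 4, 4, 4,
                              5, 5, 6, 6, 7, 7, 7, 8, 8)))
tps <- factor(c("L", "R", "T", "E", "M", "N", "Y", "I", "L", "Q",
                "I", "V", "P", "S", "D", "E", "G", "A", "C"))

# Distinct (group, tps) pairs that actually occur in the data.
observed <- nrow(unique(data.frame(group, tps)))

# Cells in the wide table: every group crossed with every tps level.
cells <- nlevels(group) * nlevels(tps)

# Cells with no matching observation; these are the NAs.
empty <- cells - observed

c(observed = observed, cells = cells, empty = empty)
```

With 8 groups and 16 tps levels there are 128 cells but only 19 observed pairs, so 109 cells have nothing to hold.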
###Now consider a set similar to the example.
> # sum is drawn with rnorm(), so your values will differ from those shown
> md4 <- data.frame(group = c(rep("X1", 3), rep("X2", 3)),
+                   tps = c("L", "R", "P", "L", "R", "P"),
+                   sum = rnorm(6, 15))
> md4
  group tps      sum
1    X1   L 13.94542
2    X1   R 14.34785
3    X1   P 15.31574
4    X2   L 14.50404
5    X2   R 13.73331
6    X2   P 14.69673
> dd <- dcast(md4, group ~ tps, value.var = "sum")
> dd
  group        L        P        R
1    X1 13.94542 15.31574 14.34785
2    X2 14.50404 14.69673 13.73331
A.K.
----- Original Message -----
From: Ista Zahn < [hidden email]>
To: jose Bartolomei < [hidden email]>
Cc: R Help < [hidden email]>
Sent: Tuesday, July 24, 2012 1:56 PM
Subject: Re: [R] Simple reshape problem I am completely missing
|
|
Looks like it. Thanks.
Now all I have to do is figure out how to get rid of all the missing combos. Oh well, another day.
Thanks
John Kane
Kingston ON Canada
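One possible way to deal with the missing combinations, if "get rid of" means showing something other than NA, is the fill argument of dcast(). This is a small sketch, not necessarily what the poster ended up doing. It uses only the first three groups of md2 with rounded values (both are simplifications for illustration), and the names md_small, with_na, and with_zero are hypothetical.

```r
library(reshape2)

# ASSUMPTION: a cut-down, rounded version of md2 (groups X1 to X3 only).
md_small <- data.frame(
  group = c("X1", "X1", "X1", "X2", "X2", "X3", "X3"),
  tps   = c("L", "R", "T", "E", "M", "N", "Y"),
  sum   = c(0.9149, 0.0368, 0.0483, 0.9827, 0.0173, 0.0483, 0.9517)
)

# Default behaviour: combinations that never occur become NA.
with_na <- dcast(md_small, group ~ tps, value.var = "sum")

# fill supplies the value to use for those structural gaps instead.
with_zero <- dcast(md_small, group ~ tps, value.var = "sum", fill = 0)

anyNA(with_na)
anyNA(with_zero)
```

Whether 0 is a sensible replacement depends on what sum represents. If the wide table is not needed at all, the long data frame already lists only the combinations that exist.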
> -----Original Message-----
> From: [hidden email]
> Sent: Tue, 24 Jul 2012 10:45:18 -0700 (PDT)
> To: [hidden email]
> Subject: Re: [R] Simple reshape problem I am completely missing
|
|