Simple reshape problem I am completely missing

Simple reshape problem I am completely missing

John Kane
I seem to be doing something really stupid or missing something really obvious, but what?

I have a simple three-column data.frame that I would like to reshape to wide format, preferably using reshape2.

An example from http://stackoverflow.com/questions/9617348/reshape-three-column-data-frame-to-matrix looked perfect, except that I wanted a data frame rather than a matrix. I just changed acast to dcast and it seems fine.

Stack Overflow example:
library(reshape2)
tmp <- data.frame(x = gl(2, 3, labels = letters[24:25]),
                  y = gl(3, 1, 6, labels = letters[1:3]),
                  z = c(1, 2, 3, 3, 3, 2))
dd <- dcast(tmp, x ~ y, value.var = "z")

My example, which does NOT work:

md2  <-  structure(list(group = structure(c(1L, 1L, 1L, 2L, 2L, 3L, 3L,
4L, 4L, 4L, 5L, 5L, 6L, 6L, 7L, 7L, 7L, 8L, 8L), .Label = c("X1",
"X2", "X3", "X4", "X5", "X6", "X7", "X8"), class = "factor"),
    tps = structure(c(7L, 12L, 14L, 4L, 8L, 9L, 16L, 6L, 7L,
    11L, 6L, 15L, 10L, 13L, 3L, 4L, 5L, 1L, 2L), .Label = c("A",
    "C", "D", "E", "G", "I", "L", "M", "N", "P", "Q", "R", "S",
    "T", "V", "Y"), class = "factor"), sum = c(0.914913196595112,
    0.0367565080432513, 0.0483302953616366, 0.982727803634948,
    0.0172721963650521, 0.0483302953616366, 0.951669704638363,
    0.89764100023006, 0.0850868034048879, 0.0172721963650521,
    0.951669704638363, 0.0483302953616366, 0.963243491956749,
    0.0367565080432513, 0.89764100023006, 0.0540287044083034,
    0.0483302953616366, 0.982727803634948, 0.0172721963650521
    )), .Names = c("group", "tps", "sum"), row.names = c(NA,
-19L), class = "data.frame")

dcast(md2,  group ~ tps , value.vars  = "sum")


What am I doing wrong?

John Kane
Kingston ON Canada


______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: Simple reshape problem I am completely missing

jose Bartolomei

Did you melt first?

library(reshape2)
?melt

> Date: Tue, 24 Jul 2012 09:17:08 -0800
> From: [hidden email]
> To: [hidden email]
> Subject: [R] Simple reshape problem I am completely missing

Re: Simple reshape problem I am completely missing

arun kirshna
In reply to this post by John Kane
Hi,

I think your group and tps variables have a lot of levels, so combinations that do not occur in the data might end up as NA.


Try this:
> md4 <- data.frame(group = c(rep("X1", 3), rep("X2", 3)), tps = c("L", "R", "P", "L", "R", "P"), sum = rnorm(6, 15))
> md4
  group tps      sum
1    X1   L 13.94542
2    X1   R 14.34785
3    X1   P 15.31574
4    X2   L 14.50404
5    X2   R 13.73331
6    X2   P 14.69673
> dd <- dcast(md4, group ~ tps, value.var = "sum")
> dd
  group        L        P        R
1    X1 13.94542 15.31574 14.34785
2    X2 14.50404 14.69673 13.73331

A.K.





Re: Simple reshape problem I am completely missing

Ista Zahn
In reply to this post by jose Bartolomei
The data is already in "long" form, so it doesn't require melting. The
problem was simply that you used the wrong argument name: change
"value.vars" to "value.var", i.e.,

dcast(md2,  group ~ tps , value.var  = "sum")

should work fine.
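
To illustrate the fix, here is a minimal sketch on a small made-up data frame. The name toy and its values are assumptions for illustration and are not part of the data posted in this thread. It shows dcast() called with the correctly spelled value.var argument, which names the column that fills the cells of the wide result.

```r
library(reshape2)

# Hypothetical toy data in long form: one row per (group, tps) pair.
toy <- data.frame(
  group = c("X1", "X1", "X2", "X2"),
  tps   = c("L", "R", "L", "R"),
  sum   = c(0.9, 0.1, 0.7, 0.3)
)

# The argument is value.var (singular); it names the column
# whose values are spread across the new tps columns.
wide <- dcast(toy, group ~ tps, value.var = "sum")
print(wide)
```

The result should have one row per group and one column per tps value, with the sum values in the cells.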

Best,
Ista

On Tue, Jul 24, 2012 at 1:23 PM, jose Bartolomei <[hidden email]> wrote:

>
> Did you melt first?
>
> library(reshape2)
> ?melt

Re: Simple reshape problem I am completely missing

arun kirshna


Hello,

I think John was asking about the NAs in the output. If both factors have many levels but only a few of their combinations occur in the data, the wide result will contain many NAs.

#For example,

> mdres <- dcast(md2, tps ~ group, value.var = "sum")
> mdres
   tps         X1        X2        X3        X4        X5         X6        X7
1    A         NA        NA        NA        NA        NA         NA        NA
2    C         NA        NA        NA        NA        NA         NA        NA
3    D         NA        NA        NA        NA        NA         NA 0.8976410
4    E         NA 0.9827278        NA        NA        NA         NA 0.0540287
5    G         NA        NA        NA        NA        NA         NA 0.0483303
6    I         NA        NA        NA 0.8976410 0.9516697         NA        NA
7    L 0.91491320        NA        NA 0.0850868        NA         NA        NA
8    M         NA 0.0172722        NA        NA        NA         NA        NA
9    N         NA        NA 0.0483303        NA        NA         NA        NA
10   P         NA        NA        NA        NA        NA 0.96324349        NA
11   Q         NA        NA        NA 0.0172722        NA         NA        NA
12   R 0.03675651        NA        NA        NA        NA         NA        NA
13   S         NA        NA        NA        NA        NA 0.03675651        NA
14   T 0.04833030        NA        NA        NA        NA         NA        NA
15   V         NA        NA        NA        NA 0.0483303         NA        NA
16   Y         NA        NA 0.9516697        NA        NA         NA        NA
          X8
1  0.9827278
2  0.0172722
3         NA
4         NA
5         NA
6         NA
7         NA
8         NA
9         NA
10        NA
11        NA
12        NA
13        NA
14        NA
15        NA
16        NA

###Now consider a set similar to the example.

> md4 <- data.frame(group = c(rep("X1", 3), rep("X2", 3)), tps = c("L", "R", "P", "L", "R", "P"), sum = rnorm(6, 15))
> md4
  group tps      sum
1    X1   L 13.94542
2    X1   R 14.34785
3    X1   P 15.31574
4    X2   L 14.50404
5    X2   R 13.73331
6    X2   P 14.69673
> dd <- dcast(md4, group ~ tps, value.var = "sum")
> dd
  group        L        P        R
1    X1 13.94542 15.31574 14.34785
2    X2 14.50404 14.69673 13.73331
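
As a rough illustration of why the NAs appear, the sketch below uses a tiny hypothetical data frame (sparse, invented for this example) in which each group has only one tps value. Every group/tps pair that has no row in the long data comes back as NA in the wide result.

```r
library(reshape2)

# Hypothetical sparse data: X1 only has "L", X2 only has "R".
sparse <- data.frame(
  group = c("X1", "X2"),
  tps   = c("L", "R"),
  sum   = c(0.9, 0.3)
)

wide_sparse <- dcast(sparse, group ~ tps, value.var = "sum")
print(wide_sparse)

# Two of the four cells have no source row, so they are NA.
n_missing <- sum(is.na(wide_sparse[, c("L", "R")]))
print(n_missing)
```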

A.K.





----- Original Message -----
From: Ista Zahn <[hidden email]>
To: jose Bartolomei <[hidden email]>
Cc: R Help <[hidden email]>
Sent: Tuesday, July 24, 2012 1:56 PM
Subject: Re: [R] Simple reshape problem I am completely missing


Re: Simple reshape problem I am completely missing

John Kane
In reply to this post by arun kirshna
Looks like it. Thanks.

Now all I have to do is figure out how to get rid of all the missing combos. Oh well, another day.
Thanks

John Kane
Kingston ON Canada
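
One possible way to deal with the missing combinations, assuming that a placeholder such as zero is acceptable for the analysis, is the fill argument of dcast(). The data frame sparse below is hypothetical and only stands in for md2.

```r
library(reshape2)

# Hypothetical sparse data: X1 only has "L", X2 only has "R".
sparse <- data.frame(
  group = c("X1", "X2"),
  tps   = c("L", "R"),
  sum   = c(0.9, 0.3)
)

# fill supplies the value used for combinations that have no source row.
wide_zero <- dcast(sparse, group ~ tps, value.var = "sum", fill = 0)
print(wide_zero)
```

Whether 0 is a sensible placeholder depends on what the values mean. The long form of the data already contains only the combinations that actually occur, so staying in long form is another option.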


> -----Original Message-----
> From: [hidden email]
> Sent: Tue, 24 Jul 2012 10:45:18 -0700 (PDT)
> To: [hidden email]
> Subject: Re: [R] Simple reshape problem I am completely missing