data frame transformation

classic Classic list List threaded Threaded
5 messages Options
Reply | Threaded
Open this post in threaded view
|

data frame transformation

R help mailing list-2
Hello Everyone,

would you be able to assist with some expertise on how to get the following done in a way that can be applied to a data set with different dimensions and without all the line items here?

we have:

id<-c(1,1,1,2,2,2,2,3,3,4,4,4,4,5,5,5,5)#length of unique IDs may differ of course in real data set, usually in magnitude of 10000
letter<-c(sample(c("A","B","C","D","E"),3),sample(c("A","B","C","D","E"),4),sample(c("A","B","C","D","E"),2),
          sample(c("A","B","C","D","E"),4),sample(c("A","B","C","D","E"),4))#number of unique "letters" is less than 4000 in real data set and they are no duplicates within same ID
weight<-c(sample(c(1:30),3),sample(c(1:30),4),sample(c(1:30),2),
          sample(c(1:30),4),sample(c(1:30),4))#number of unique weights is below 50 in real data set and they are no duplicates within same ID


data<-data.frame(id=id,letter=letter,weight=weight)

#goal is to get the following transformation where a column is added for each unique letter and the weight is pulled into the column if the letter exist within the ID, otherwise NA
#so we would get datatransform like below but without the many steps described here

datatransfer<-data.frame(data,apply(data[2],2,function(x) ifelse(x=="A",data$weight,NA)))
datatransfer<-data.frame(datatransfer,apply(datatransfer[2],2,function(x) ifelse(x=="B",data$weight,NA)))
datatransfer<-data.frame(datatransfer,apply(datatransfer[2],2,function(x) ifelse(x=="C",data$weight,NA)))
datatransfer<-data.frame(datatransfer,apply(datatransfer[2],2,function(x) ifelse(x=="D",data$weight,NA)))
datatransfer<-data.frame(datatransfer,apply(datatransfer[2],2,function(x) ifelse(x=="E",data$weight,NA)))

colnames(datatransfer)<-c("id","weight","letter","A","B","C","D","E")
much appreciate the help,

thanks

Andras 

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: data frame transformation

K. Elo-2
Hi!

Maybe this would do the trick:

--- snip ---

library(reshape2) # Use 'reshape2'
library(dplyr)    # Use 'dplyr'

datatransfer<-data %>% mutate(letter2=letter) %>%
  dcast(id+letter~letter2, value.var="weight")

--- snip ---

Or did I misunderstood something?

Best,

Kimmo

2019-01-06, 13:16 +0000, Andras Farkas via R-help wrote:

> Hello Everyone,
>
> would you be able to assist with some expertise on how to get the
> following done in a way that can be applied to a data set with
> different dimensions and without all the line items here?
>
> we have:
>
> id<-c(1,1,1,2,2,2,2,3,3,4,4,4,4,5,5,5,5)#length of unique IDs may
> differ of course in real data set, usually in magnitude of 10000
> letter<-
> c(sample(c("A","B","C","D","E"),3),sample(c("A","B","C","D","E"),4),s
> ample(c("A","B","C","D","E"),2),
>          
> sample(c("A","B","C","D","E"),4),sample(c("A","B","C","D","E"),4))#nu
> mber of unique "letters" is less than 4000 in real data set and they
> are no duplicates within same ID
> weight<-c(sample(c(1:30),3),sample(c(1:30),4),sample(c(1:30),2),
>           sample(c(1:30),4),sample(c(1:30),4))#number of unique
> weights is below 50 in real data set and they are no duplicates
> within same ID
>
>
> data<-data.frame(id=id,letter=letter,weight=weight)
>
> #goal is to get the following transformation where a column is added
> for each unique letter and the weight is pulled into the column if
> the letter exist within the ID, otherwise NA
> #so we would get datatransform like below but without the many steps
> described here
>
> datatransfer<-data.frame(data,apply(data[2],2,function(x)
> ifelse(x=="A",data$weight,NA)))
> datatransfer<-
> data.frame(datatransfer,apply(datatransfer[2],2,function(x)
> ifelse(x=="B",data$weight,NA)))
> datatransfer<-
> data.frame(datatransfer,apply(datatransfer[2],2,function(x)
> ifelse(x=="C",data$weight,NA)))
> datatransfer<-
> data.frame(datatransfer,apply(datatransfer[2],2,function(x)
> ifelse(x=="D",data$weight,NA)))
> datatransfer<-
> data.frame(datatransfer,apply(datatransfer[2],2,function(x)
> ifelse(x=="E",data$weight,NA)))
>
> colnames(datatransfer)<-c("id","weight","letter","A","B","C","D","E")
> much appreciate the help,
>
> thanks
>
> Andras
>
> ______________________________________________
> [hidden email] mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: data frame transformation

Bert Gunter-2
In reply to this post by R help mailing list-2
Like this (using base R only)?

dat<-data.frame(id=id,letter=letter,weight=weight) # using your data

ud <- unique(dat$id)
ul = unique(dat$letter)
d <- with(dat,
          data.frame(
          letter = rep(ul, e = length(ud)),
          id = rep(ud, length(ul))
          ) )

 merge(dat[,c(2,1,3)],d, all.y = TRUE)
## resulting in:

   letter id weight
1       A  1     25
2       A  2     28
3       A  3     14
4       A  4     27
5       A  5     NA
6       B  1     13
7       B  2     14
8       B  3     NA
9       B  4     15
10      B  5      2
11      C  1     NA
12      C  2     NA
13      C  3     NA
14      C  4     NA
15      C  5     25
16      D  1     24
17      D  2     18
18      D  3     NA
19      D  4     29
20      D  5     27
21      E  1     NA
22      E  2      2
23      E  3     20
24      E  4     25
25      E  5     28


Cheers,

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Sun, Jan 6, 2019 at 5:16 AM Andras Farkas via R-help <
[hidden email]> wrote:

> Hello Everyone,
>
> would you be able to assist with some expertise on how to get the
> following done in a way that can be applied to a data set with different
> dimensions and without all the line items here?
>
> we have:
>
> id<-c(1,1,1,2,2,2,2,3,3,4,4,4,4,5,5,5,5)#length of unique IDs may differ
> of course in real data set, usually in magnitude of 10000
>
> letter<-c(sample(c("A","B","C","D","E"),3),sample(c("A","B","C","D","E"),4),sample(c("A","B","C","D","E"),2),
>
> sample(c("A","B","C","D","E"),4),sample(c("A","B","C","D","E"),4))#number
> of unique "letters" is less than 4000 in real data set and they are no
> duplicates within same ID
> weight<-c(sample(c(1:30),3),sample(c(1:30),4),sample(c(1:30),2),
>           sample(c(1:30),4),sample(c(1:30),4))#number of unique weights is
> below 50 in real data set and they are no duplicates within same ID
>
>
> data<-data.frame(id=id,letter=letter,weight=weight)
>
> #goal is to get the following transformation where a column is added for
> each unique letter and the weight is pulled into the column if the letter
> exist within the ID, otherwise NA
> #so we would get datatransform like below but without the many steps
> described here
>
> datatransfer<-data.frame(data,apply(data[2],2,function(x)
> ifelse(x=="A",data$weight,NA)))
> datatransfer<-data.frame(datatransfer,apply(datatransfer[2],2,function(x)
> ifelse(x=="B",data$weight,NA)))
> datatransfer<-data.frame(datatransfer,apply(datatransfer[2],2,function(x)
> ifelse(x=="C",data$weight,NA)))
> datatransfer<-data.frame(datatransfer,apply(datatransfer[2],2,function(x)
> ifelse(x=="D",data$weight,NA)))
> datatransfer<-data.frame(datatransfer,apply(datatransfer[2],2,function(x)
> ifelse(x=="E",data$weight,NA)))
>
> colnames(datatransfer)<-c("id","weight","letter","A","B","C","D","E")
> much appreciate the help,
>
> thanks
>
> Andras
>
> ______________________________________________
> [hidden email] mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

        [[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: data frame transformation

Bert Gunter-2
In reply to this post by R help mailing list-2
... and my reordering of column indices was unnecessary:
    merge(dat, d, all.y = TRUE)
will do.

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Sun, Jan 6, 2019 at 5:16 AM Andras Farkas via R-help <
[hidden email]> wrote:

> Hello Everyone,
>
> would you be able to assist with some expertise on how to get the
> following done in a way that can be applied to a data set with different
> dimensions and without all the line items here?
>
> we have:
>
> id<-c(1,1,1,2,2,2,2,3,3,4,4,4,4,5,5,5,5)#length of unique IDs may differ
> of course in real data set, usually in magnitude of 10000
>
> letter<-c(sample(c("A","B","C","D","E"),3),sample(c("A","B","C","D","E"),4),sample(c("A","B","C","D","E"),2),
>
> sample(c("A","B","C","D","E"),4),sample(c("A","B","C","D","E"),4))#number
> of unique "letters" is less than 4000 in real data set and they are no
> duplicates within same ID
> weight<-c(sample(c(1:30),3),sample(c(1:30),4),sample(c(1:30),2),
>           sample(c(1:30),4),sample(c(1:30),4))#number of unique weights is
> below 50 in real data set and they are no duplicates within same ID
>
>
> data<-data.frame(id=id,letter=letter,weight=weight)
>
> #goal is to get the following transformation where a column is added for
> each unique letter and the weight is pulled into the column if the letter
> exist within the ID, otherwise NA
> #so we would get datatransform like below but without the many steps
> described here
>
> datatransfer<-data.frame(data,apply(data[2],2,function(x)
> ifelse(x=="A",data$weight,NA)))
> datatransfer<-data.frame(datatransfer,apply(datatransfer[2],2,function(x)
> ifelse(x=="B",data$weight,NA)))
> datatransfer<-data.frame(datatransfer,apply(datatransfer[2],2,function(x)
> ifelse(x=="C",data$weight,NA)))
> datatransfer<-data.frame(datatransfer,apply(datatransfer[2],2,function(x)
> ifelse(x=="D",data$weight,NA)))
> datatransfer<-data.frame(datatransfer,apply(datatransfer[2],2,function(x)
> ifelse(x=="E",data$weight,NA)))
>
> colnames(datatransfer)<-c("id","weight","letter","A","B","C","D","E")
> much appreciate the help,
>
> thanks
>
> Andras
>
> ______________________________________________
> [hidden email] mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

        [[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: data frame transformation

R help mailing list-2
Thanks Bert this will do...
Andras

Sent from Yahoo Mail on Android
 
  On Sun, Jan 6, 2019 at 1:09 PM, Bert Gunter<[hidden email]> wrote:   ... and my reordering of column indices was unnecessary:    merge(dat, d, all.y = TRUE)will do.
Bert Gunter

"The trouble with having an open mind is that people keep coming along and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Sun, Jan 6, 2019 at 5:16 AM Andras Farkas via R-help <[hidden email]> wrote:

Hello Everyone,

would you be able to assist with some expertise on how to get the following done in a way that can be applied to a data set with different dimensions and without all the line items here?

we have:

id<-c(1,1,1,2,2,2,2,3,3,4,4,4,4,5,5,5,5)#length of unique IDs may differ of course in real data set, usually in magnitude of 10000
letter<-c(sample(c("A","B","C","D","E"),3),sample(c("A","B","C","D","E"),4),sample(c("A","B","C","D","E"),2),
          sample(c("A","B","C","D","E"),4),sample(c("A","B","C","D","E"),4))#number of unique "letters" is less than 4000 in real data set and they are no duplicates within same ID
weight<-c(sample(c(1:30),3),sample(c(1:30),4),sample(c(1:30),2),
          sample(c(1:30),4),sample(c(1:30),4))#number of unique weights is below 50 in real data set and they are no duplicates within same ID


data<-data.frame(id=id,letter=letter,weight=weight)

#goal is to get the following transformation where a column is added for each unique letter and the weight is pulled into the column if the letter exist within the ID, otherwise NA
#so we would get datatransform like below but without the many steps described here

datatransfer<-data.frame(data,apply(data[2],2,function(x) ifelse(x=="A",data$weight,NA)))
datatransfer<-data.frame(datatransfer,apply(datatransfer[2],2,function(x) ifelse(x=="B",data$weight,NA)))
datatransfer<-data.frame(datatransfer,apply(datatransfer[2],2,function(x) ifelse(x=="C",data$weight,NA)))
datatransfer<-data.frame(datatransfer,apply(datatransfer[2],2,function(x) ifelse(x=="D",data$weight,NA)))
datatransfer<-data.frame(datatransfer,apply(datatransfer[2],2,function(x) ifelse(x=="E",data$weight,NA)))

colnames(datatransfer)<-c("id","weight","letter","A","B","C","D","E")
much appreciate the help,

thanks

Andras 

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

 

        [[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.