# data frame transformation

5 messages

## data frame transformation

Hello Everyone,

would you be able to assist with some expertise on how to get the following done in a way that can be applied to a data set with different dimensions and without all the line items here?

We have:

    id <- c(1,1,1,2,2,2,2,3,3,4,4,4,4,5,5,5,5)
    # number of unique IDs may differ of course in the real data set, usually on the order of 10000

    letter <- c(sample(c("A","B","C","D","E"),3), sample(c("A","B","C","D","E"),4), sample(c("A","B","C","D","E"),2),
                sample(c("A","B","C","D","E"),4), sample(c("A","B","C","D","E"),4))
    # number of unique "letters" is less than 4000 in the real data set and there are no duplicates within the same ID

    weight <- c(sample(c(1:30),3), sample(c(1:30),4), sample(c(1:30),2),
                sample(c(1:30),4), sample(c(1:30),4))
    # number of unique weights is below 50 in the real data set and there are no duplicates within the same ID

    data <- data.frame(id=id, letter=letter, weight=weight)

    # goal is to get the following transformation, where a column is added for each unique letter
    # and the weight is pulled into the column if the letter exists within the ID, otherwise NA
    # so we would get datatransfer like below but without the many steps described here

    datatransfer <- data.frame(data, apply(data[2],2,function(x) ifelse(x=="A",data$weight,NA)))
    datatransfer <- data.frame(datatransfer, apply(datatransfer[2],2,function(x) ifelse(x=="B",data$weight,NA)))
    datatransfer <- data.frame(datatransfer, apply(datatransfer[2],2,function(x) ifelse(x=="C",data$weight,NA)))
    datatransfer <- data.frame(datatransfer, apply(datatransfer[2],2,function(x) ifelse(x=="D",data$weight,NA)))
    datatransfer <- data.frame(datatransfer, apply(datatransfer[2],2,function(x) ifelse(x=="E",data$weight,NA)))

    # column order of the data frame is id, letter, weight, followed by one column per letter
    colnames(datatransfer) <- c("id","letter","weight","A","B","C","D","E")

Much appreciate the help,

thanks

Andras

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
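For illustration, the five near-identical steps in the question can be collapsed into a single `sapply()` call over the unique letters. This is only a minimal base-R sketch, assuming (as stated in the question) at most one row per id/letter pair. The fixed seed and the helper names `n_per_id`, `pool`, `lv` and `cols` are additions made here and are not part of the original post.

```r
set.seed(42)  # assumption: fixed seed so the sampled example is reproducible

id <- c(1,1,1,2,2,2,2,3,3,4,4,4,4,5,5,5,5)
n_per_id <- as.vector(table(id))          # rows per id: 3 4 2 4 4
pool <- c("A","B","C","D","E")

# Sample without replacement inside each id, so there are no duplicates per id
letter <- unlist(lapply(n_per_id, function(n) sample(pool, n)))
weight <- unlist(lapply(n_per_id, function(n) sample(1:30, n)))
data <- data.frame(id = id, letter = letter, weight = weight,
                   stringsAsFactors = FALSE)

# One column per unique letter: the weight where the row's letter matches, NA elsewhere
lv <- sort(unique(data$letter))
cols <- sapply(lv, function(l) ifelse(data$letter == l, data$weight, NA))

datatransfer <- data.frame(data, cols)
```

Because `lv` is derived from the data, the same lines apply unchanged when the set of letters or the number of ids differs.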

## Re: data frame transformation

Hi!

Maybe this would do the trick:

--- snip ---

    library(reshape2) # Use 'reshape2'
    library(dplyr)    # Use 'dplyr'
    datatransfer <- data %>% mutate(letter2=letter) %>%
      dcast(id+letter~letter2, value.var="weight")

--- snip ---

Or did I misunderstand something?

Best,
Kimmo

2019-01-06, 13:16 +0000, Andras Farkas via R-help wrote:
> Hello Everyone,
>
> would you be able to assist with some expertise on how to get the
> following done in a way that can be applied to a data set with
> different dimensions and without all the line items here?
> [...]
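The `mutate(letter2=letter)` step in the reply above duplicates the letter column so that it can appear on both sides of the `dcast()` formula: once as a row identifier and once as the source of the new column names. For readers who prefer to avoid the two packages, a comparable layout can be sketched in base R with matrix indexing. The toy data and the names `lv` and `wide` below are made up for illustration, and the row order may differ from what `dcast()` returns.

```r
# Hypothetical toy data (not the sampled data from the thread)
data <- data.frame(
  id     = c(1, 1, 1, 2, 2, 3),
  letter = c("A", "C", "E", "B", "A", "D"),
  weight = c(25, 13, 7, 14, 28, 20),
  stringsAsFactors = FALSE
)

lv <- sort(unique(data$letter))

# Start from an all-NA matrix with one column per unique letter
wide <- matrix(NA_real_, nrow = nrow(data), ncol = length(lv),
               dimnames = list(NULL, lv))

# Fill exactly one cell per row: (row i, column of that row's letter) <- weight
wide[cbind(seq_len(nrow(data)), match(data$letter, lv))] <- data$weight

# id and letter followed by the per-letter columns
datatransfer <- data.frame(data[c("id", "letter")], wide)
```

Indexing with a two-column matrix of (row, column) positions fills all cells in one vectorised assignment, which avoids building one `ifelse()` vector per letter.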

## Re: data frame transformation

Like this (using base R only)?

    dat <- data.frame(id=id, letter=letter, weight=weight) # using your data
    ud <- unique(dat$id)
    ul <- unique(dat$letter)
    d <- with(dat,
              data.frame(
                letter = rep(ul, each = length(ud)),
                id = rep(ud, length(ul))
              ))

    merge(dat[,c(2,1,3)], d, all.y = TRUE)

    ## resulting in:

       letter id weight
    1       A  1     25
    2       A  2     28
    3       A  3     14
    4       A  4     27
    5       A  5     NA
    6       B  1     13
    7       B  2     14
    8       B  3     NA
    9       B  4     15
    10      B  5      2
    11      C  1     NA
    12      C  2     NA
    13      C  3     NA
    14      C  4     NA
    15      C  5     25
    16      D  1     24
    17      D  2     18
    18      D  3     NA
    19      D  4     29
    20      D  5     27
    21      E  1     NA
    22      E  2      2
    23      E  3     20
    24      E  4     25
    25      E  5     28

Cheers,
Bert Gunter

"The trouble with having an open mind is that people keep coming along and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)

On Sun, Jan 6, 2019 at 5:16 AM Andras Farkas via R-help <[hidden email]> wrote:
> Hello Everyone,
>
> would you be able to assist with some expertise on how to get the
> following done in a way that can be applied to a data set with different
> dimensions and without all the line items here?
> [...]
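The reply above builds every id/letter combination and merges the observed weights onto it, which gives a long table with NA for the missing pairs. As a rough sketch of the same idea, `expand.grid()` can generate the combinations, and `tapply()` can then spread the weights into one row per id if that shape is acceptable. The toy data and the names `grid`, `long` and `wide` are made up here for illustration and do not come from the thread.

```r
# Hypothetical toy data (not the sampled data from the thread)
dat <- data.frame(
  id     = c(1, 1, 1, 2, 2, 3),
  letter = c("A", "C", "E", "B", "A", "D"),
  weight = c(25, 13, 7, 14, 28, 20),
  stringsAsFactors = FALSE
)

# All id/letter combinations, whether observed or not
grid <- expand.grid(letter = unique(dat$letter), id = unique(dat$id),
                    stringsAsFactors = FALSE)

# Long format: merge() joins on the shared columns (id, letter);
# all.y = TRUE keeps unobserved pairs, with weight set to NA
long <- merge(dat, grid, all.y = TRUE)

# Wide format: an id-by-letter matrix, NA where a pair was never observed.
# Each cell holds at most one weight (no duplicates within an id).
wide <- with(dat, tapply(weight, list(id = id, letter = letter),
                         function(w) w[1]))
```

Note that `wide` has one row per id rather than one row per original observation, so it is a different shape from the table requested in the question.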