

Hello Everyone,
Would you be able to share some expertise on how to do the following in a way that generalizes to data sets of different dimensions, without writing out one line per item as I do here?
We have:
id <- c(1,1,1,2,2,2,2,3,3,4,4,4,4,5,5,5,5)
# the number of unique IDs may differ in the real data set, usually on the order of 10000
letter <- c(sample(c("A","B","C","D","E"),3), sample(c("A","B","C","D","E"),4), sample(c("A","B","C","D","E"),2),
            sample(c("A","B","C","D","E"),4), sample(c("A","B","C","D","E"),4))
# fewer than 4000 unique "letters" in the real data set, with no duplicates within the same ID
weight <- c(sample(c(1:30),3), sample(c(1:30),4), sample(c(1:30),2),
            sample(c(1:30),4), sample(c(1:30),4))
# fewer than 50 unique weights in the real data set, with no duplicates within the same ID
data <- data.frame(id=id, letter=letter, weight=weight)
# The goal is the following transformation: one column is added for each unique letter, and the weight is pulled into that column if the letter occurs within the ID, otherwise NA.
# So we would get datatransfer as below, but without the many steps written out here.
datatransfer <- data.frame(data, apply(data[2], 2, function(x) ifelse(x=="A", data$weight, NA)))
datatransfer <- data.frame(datatransfer, apply(datatransfer[2], 2, function(x) ifelse(x=="B", data$weight, NA)))
datatransfer <- data.frame(datatransfer, apply(datatransfer[2], 2, function(x) ifelse(x=="C", data$weight, NA)))
datatransfer <- data.frame(datatransfer, apply(datatransfer[2], 2, function(x) ifelse(x=="D", data$weight, NA)))
datatransfer <- data.frame(datatransfer, apply(datatransfer[2], 2, function(x) ifelse(x=="E", data$weight, NA)))
# column names follow the column order of data: id, letter, weight
colnames(datatransfer) <- c("id","letter","weight","A","B","C","D","E")
I much appreciate the help,
Thanks,
Andras
______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Hi!
Maybe this would do the trick:
--- snip ---
library(reshape2) # for dcast()
library(dplyr)    # for mutate() and %>%
datatransfer <- data %>% mutate(letter2 = letter) %>%
  dcast(id + letter ~ letter2, value.var = "weight")
--- snip ---
Or did I misunderstand something?
Best,
Kimmo
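
For comparison with the dcast() call above, here is a rough base-R sketch of the same idea that needs no extra packages. It is not from the thread: the seed, the helper names (sizes, lv, wide) and the toy data are assumptions made only for illustration. It keeps one row per original row and fills one column per distinct letter by matrix indexing, so it should not need one line per letter.

```r
# Reproducible toy data in the shape described in the question
set.seed(42)                       # arbitrary seed, only for reproducibility
sizes  <- c(3, 4, 2, 4, 4)         # number of rows per id
id     <- rep(seq_along(sizes), times = sizes)
letter <- unlist(lapply(sizes, function(n) sample(LETTERS[1:5], n)))
weight <- unlist(lapply(sizes, function(n) sample(1:30, n)))
data   <- data.frame(id = id, letter = letter, weight = weight,
                     stringsAsFactors = FALSE)

# One column per distinct letter, initially all NA
lv   <- sort(unique(data$letter))
wide <- matrix(NA_integer_, nrow = nrow(data), ncol = length(lv),
               dimnames = list(NULL, lv))

# Matrix indexing: for row i, write weight[i] into the column of letter[i]
wide[cbind(seq_len(nrow(data)), match(data$letter, lv))] <- data$weight

datatransfer <- data.frame(data, wide)
```

The result has the original three columns followed by one column per letter, with exactly one non-NA entry per row. Whether this scales acceptably to a few thousand distinct letters depends on available memory, since the intermediate matrix is dense.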
On 2019-01-06, 13:16 +0000, Andras Farkas via R-help wrote:
> would you be able to assist with some expertise on how to get the
> following done in a way that can be applied to a data set with
> different dimensions and without all the line items here?
> [...]


Like this (using base R only)?
dat <- data.frame(id=id, letter=letter, weight=weight) # using your data
ud <- unique(dat$id)
ul <- unique(dat$letter)
# all id x letter combinations
d <- data.frame(
  letter = rep(ul, each = length(ud)),
  id = rep(ud, length(ul))
)
# keep every combination; weight is NA where dat has no matching row
merge(dat[,c(2,1,3)], d, all.y = TRUE)
## resulting in:
   letter id weight
1       A  1     25
2       A  2     28
3       A  3     14
4       A  4     27
5       A  5     NA
6       B  1     13
7       B  2     14
8       B  3     NA
9       B  4     15
10      B  5      2
11      C  1     NA
12      C  2     NA
13      C  3     NA
14      C  4     NA
15      C  5     25
16      D  1     24
17      D  2     18
18      D  3     NA
19      D  4     29
20      D  5     27
21      E  1     NA
22      E  2      2
23      E  3     20
24      E  4     25
25      E  5     28
Cheers,
Bert Gunter
"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
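
The grid of all id/letter combinations built with rep() above can also be produced with expand.grid() from base R. The following is only a sketch on made-up toy values (not the random data from the thread), and the names grid and full are chosen here for illustration.

```r
# Toy data: 3 ids, 4 distinct letters, 6 observed (id, letter) pairs
dat <- data.frame(
  id     = c(1, 1, 2, 2, 2, 3),
  letter = c("A", "B", "A", "C", "D", "B"),
  weight = c(25, 13, 28, 14, 18, 20),
  stringsAsFactors = FALSE
)

# Every combination of the distinct letters and ids
grid <- expand.grid(letter = unique(dat$letter), id = unique(dat$id),
                    stringsAsFactors = FALSE)

# Right outer join: all grid rows are kept, weight is NA where dat has no match
full <- merge(dat, grid, all.y = TRUE)
```

Here full has 3 x 4 = 12 rows, of which 6 carry a weight and 6 are NA. With roughly 10000 ids and up to 4000 letters the full grid would have on the order of 40 million rows, so this long layout may be costly on the real data.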


... and my reordering of column indices was unnecessary:
merge(dat, d, all.y = TRUE)
will do.
Bert Gunter
"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
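
A note on why the reordering makes no difference: when no by argument is given, merge() joins on the column names the two data frames have in common, wherever those columns sit. A small sketch with made-up values (not the thread's data) that compares both calls:

```r
dat <- data.frame(id = c(1, 1, 2), letter = c("A", "B", "A"),
                  weight = c(25, 13, 28), stringsAsFactors = FALSE)
d <- data.frame(letter = rep(c("A", "B"), each = 2),
                id = rep(c(1, 2), 2), stringsAsFactors = FALSE)

m1 <- merge(dat[, c(2, 1, 3)], d, all.y = TRUE)  # shared columns found as letter, id
m2 <- merge(dat, d, all.y = TRUE)                # shared columns found as id, letter

# Bring both results into the same column and row order before comparing
cols <- c("id", "letter", "weight")
m1 <- m1[order(m1$id, m1$letter), cols]
m2 <- m2[order(m2$id, m2$letter), cols]
rownames(m1) <- NULL
rownames(m2) <- NULL
same <- isTRUE(all.equal(m1, m2))
```

The two results contain the same rows; only the order of the key columns and the sort order of the rows differ between the calls.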


Thanks Bert, this will do...
Andras
On Sun, Jan 6, 2019 at 1:09 PM, Bert Gunter <[hidden email]> wrote:
> ... and my reordering of column indices was unnecessary:
> merge(dat, d, all.y = TRUE)
> will do.

