Hi,

I have a grouped data set and would like to calculate weighted proportions for a large number of factor variables within each group member. Rather than using dplyr::count() on each of these factors individually, the idea would be to do it for all factors at once. Does anyone know how this would work? Here is a reproducible example:

############################################################
# reproducible example
df1 <- data.frame(wt=rnorm(90),
                  group=paste0('reg', 1:5),
                  var1=rep(c('male','female'), times=45),
                  var2=rep(c('low','med','high'), each=30)) %>% tbl_df()

# instead of doing this separately for each factor ...
df2 <- df1 %>%
  group_by(group) %>%
  dplyr::count(var1, wt=wt) %>%
  mutate(prop1=n/sum(n))

df3 <- df1 %>%
  group_by(group) %>%
  dplyr::count(var2, wt=wt) %>%
  mutate(prop2=n/sum(n)) %>%
  left_join(df2, by='group')

# I would like to do something like the following (which does of course not work):
my_fun <- function(x,wt){
  freq1 <- dplyr::count(x, wt=wt)
  prop1 <- freq1 / sum(freq1)
  return(prop)
}

df1 %>%
  group_by(group) %>%
  summarise_all(.funs=my_fun(.), .vars=c('var1', 'var2'))
############################################################

Best regards,
Erich

> On Mar 22, 2018, at 3:34 PM, Striessnig, Erich wrote:
>
> Hi,
>
> I have a grouped data set and would like to calculate weighted proportions for a large number of factor variables within each group member. Rather than using dplyr::count() on each of these factors individually, the idea would be to do it for all factors at once. Does anyone know how this would work? Here is a reproducible example:
>
> ############################################################
> # reproducible example
> df1 <- data.frame(wt=rnorm(90),
> group=paste0('reg', 1:5),
> var1=rep(c('male','female'), times=45),
> var2=rep(c('low','med','high'), each=30)) %>% tbl_df()
>
> # instead of doing this separately for each factor ...
> df2 <- df1 %>%
> group_by(group) %>%
> dplyr::count(var1, wt=wt) %>%
> mutate(prop1=n/sum(n))
>
> df3 <- df1 %>%
> group_by(group) %>%
> dplyr::count(var2, wt=wt) %>%
> mutate(prop2=n/sum(n)) %>%
> left_join(df2, by='group')
>
> # I would like to do something like the following (which does of course not work):
> my_fun <- function(x,wt){
> freq1 <- dplyr::count(x, wt=wt)
> prop1 <- freq1 / sum(freq1)
> return(prop)
> }
>
> df1 %>%
> group_by(group) %>%
> summarise_all(.funs=my_fun(.), .vars=c('var1', 'var2'))
> ############################################################

You might find useful functions in the 'freqweights' package. It appears from its description that it was designed to fit into the tidyverse paradigm.

I think the survey package might also be useful, but it is not particularly designed for use with tibbles and `%>%`. Might work. Might not.
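
If staying within the tidyverse is preferred, one possible approach (a sketch only, not tested against your real data) is to reshape the factor columns into long form and then count once. This assumes the tidyr package is installed and recent enough to provide pivot_longer(). The names "variable", "level" and "props" are made up for illustration, and the weights are drawn with runif() instead of rnorm() so that they are all positive, which is my assumption about what sensible weights look like.

```r
library(dplyr)
library(tidyr)

set.seed(1)
# Example data as in the original post, except that the weights come from
# runif() (an assumption) so that none of them are negative.
df1 <- data.frame(wt = runif(90),
                  group = paste0('reg', 1:5),
                  var1 = rep(c('male', 'female'), times = 45),
                  var2 = rep(c('low', 'med', 'high'), each = 30),
                  stringsAsFactors = FALSE)

props <- df1 %>%
  # stack the factor columns: one row per observation and variable
  pivot_longer(cols = c(var1, var2),
               names_to = "variable", values_to = "level") %>%
  # weighted frequency of each level within group and variable
  count(group, variable, level, wt = wt) %>%
  # turn the weighted frequencies into proportions within group and variable
  group_by(group, variable) %>%
  mutate(prop = n / sum(n)) %>%
  ungroup()

print(props)
```

The result is in long form, with one row per group, variable and level. More factor columns can be added to the cols argument of pivot_longer() without changing the rest of the pipeline.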

HTH;
David

>
> Best regards,
> Erich
