Purr and Basic Functional Programming Tasks

Lorenzo Isella
Dear All,
I am taking my first steps with the tidyverse purrr package and I am
stuck on some probably trivial tasks.
Consider the following data set:


zz<-list(structure(list(year = c(2000, 2001, 2002, 2003, 2000, 2001,
2002, 2003, 2000, 2001, 2002, 2003), tot_i = c(22393349.081,
23000574.372, 21682040.898, 21671102.853, 34361300.338, 35297814.942,
34745691.204, 35878883.117, 11967951.257, 12297240.57, 13063650.306,
14207780.264), relation = c("EU28-Algeria", "EU28-Algeria", "EU28-Algeria",
"EU28-Algeria", "World-Algeria", "World-Algeria", "World-Algeria",
"World-Algeria", "Extra EU28-Algeria", "Extra EU28-Algeria",
"Extra EU28-Algeria", "Extra EU28-Algeria"), g_rate = c(0.736046372770467,
0.0271163231905857, -0.0573261107603093, -0.000504474880914325,
0.614846575418334, 0.0272549232650638, -0.0156418673197543, 0.0326138831530727,
0.428272657063707, 0.0275142592018328, 0.0623237165799383, 0.0875811837579971
)), row.names = c(NA, -12L), class = c("tbl_df", "tbl", "data.frame"
)), structure(list(year = c(2000, 2001, 2002, 2003, 2000, 2001,
2002, 2003, 2000, 2001, 2002, 2003), tot_i = c(9233346.648, 7869288.171,
7271485.687, 6395999.102, 21393949.287, 19851236.26, 19449339.887,
16055014.309, 12160602.639, 11981948.089, 12177854.2, 9659015.207
), relation = c("EU28-Egypt", "EU28-Egypt", "EU28-Egypt", "EU28-Egypt",
"World-Egypt", "World-Egypt", "World-Egypt", "World-Egypt", "Extra EU28-Egypt",
"Extra EU28-Egypt", "Extra EU28-Egypt", "Extra EU28-Egypt"),
g_rate = c(0.0970653722744164, -0.147731751985664, -0.0759665259436081,
-0.120399959882366, 0.124744629514854, -0.0721097823643728,
-0.0202454077789513, -0.174521376957825, 0.146712116047648,
-0.0146912579338002, 0.0163501051368976, -0.206837670383671
)), row.names = c(NA, -12L), class = c("tbl_df", "tbl", "data.frame"
)))

I can do very simple things with map(), for instance iteratively taking the mean of a certain column:

map(zz, function(x) mean(x$tot_i))

or filtering on a given year:

map(zz, function(x) filter(x, year==2000))

However, I bang my head against the wall as soon as I want to add a bit of complexity. For instance:

1)    I want to group each element of zz by relation and summarise it by taking the average of tot_i, and

2)    Given a list of years

    ll<-list(c(2000, 2001), c(2001, 2003))

I would like to filter each element of the zz list by the corresponding vector of years in ll (the first element by c(2000, 2001), the second by c(2001, 2003)).

I would then have plenty of other operations to carry out on the data, but already understanding 1 and 2 would take me a long way from where I am stuck now.

Any suggestion is welcome.
Cheers

Lorenzo

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: Purr and Basic Functional Programming Tasks

jholtman
Does this answer the first question?

> rel <- map(zz, function(x){
+   group_by(x, relation) %>% summarise(tot = mean(tot_i))
+ })
> rel
[[1]]
# A tibble: 3 x 2
  relation                 tot
  <chr>                  <dbl>
1 EU28-Algeria       22186767.
2 Extra EU28-Algeria 12884156.
3 World-Algeria      35070922.

[[2]]
# A tibble: 3 x 2
  relation               tot
  <chr>                <dbl>
1 EU28-Egypt        7692530.
2 Extra EU28-Egypt 11494855.
3 World-Egypt      19187385.

>
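
For anyone running this as a script rather than at the console, here is a minimal self-contained sketch of the same idea. It assumes purrr and dplyr are installed, and it uses a small made-up list, zz_small, in place of the full zz from the original post (the name and the toy values are illustrative only). It also shows one optional follow-up step, stacking the per-element summaries with bind_rows(), which was not part of the question.

```r
library(purrr)
library(dplyr)

# Toy stand-in for zz: two small tibbles with the same columns (made-up values).
zz_small <- list(
  tibble(relation = c("A", "A", "B", "B"), tot_i = c(1, 3, 10, 20)),
  tibble(relation = c("C", "C", "D", "D"), tot_i = c(2, 4, 30, 50))
)

# Group each element by relation and average tot_i.
# The ~ shorthand is equivalent to function(x) with .x as the argument.
rel_small <- map(zz_small, ~ .x %>%
  group_by(relation) %>%
  summarise(tot = mean(tot_i)))

# Optionally stack the per-element summaries into one tibble;
# the "element" column records which list element each row came from.
rel_all <- bind_rows(rel_small, .id = "element")
```

The result of map() is still a list with one summary tibble per element of the input, so it can be passed on to further map() calls; bind_rows() is only useful if a single table is easier to work with downstream.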

Jim Holtman
*Data Munger Guru*


*What is the problem that you are trying to solve? Tell me what you want to
do, not how you want to do it.*


On Fri, Jan 25, 2019 at 5:45 AM Lorenzo Isella <[hidden email]>
wrote:

> [...]

Re: Purr and Basic Functional Programming Tasks

jholtman
Try this for the second question:

> years <- map2(zz,
+               list(c(2000, 2001), c(2001, 2003)),
+               ~ filter(.x, year %in% .y)
+ )
> years
[[1]]
# A tibble: 6 x 4
   year     tot_i relation           g_rate
  <dbl>     <dbl> <chr>               <dbl>
1  2000 22393349. EU28-Algeria       0.736
2  2001 23000574. EU28-Algeria       0.0271
3  2000 34361300. World-Algeria      0.615
4  2001 35297815. World-Algeria      0.0273
5  2000 11967951. Extra EU28-Algeria 0.428
6  2001 12297241. Extra EU28-Algeria 0.0275

[[2]]
# A tibble: 6 x 4
   year     tot_i relation          g_rate
  <dbl>     <dbl> <chr>              <dbl>
1  2001  7869288. EU28-Egypt       -0.148
2  2003  6395999. EU28-Egypt       -0.120
3  2001 19851236. World-Egypt      -0.0721
4  2003 16055014. World-Egypt      -0.175
5  2001 11981948. Extra EU28-Egypt -0.0147
6  2003  9659015. Extra EU28-Egypt -0.207

>
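
As a variation on the call above, the sketch below passes the list ll from the original question instead of writing the years inline, and uses a named two-argument function in place of the ~ shorthand. It assumes purrr and dplyr are installed; zz_small and its values are made up for illustration, and the length check is an extra safeguard that was not in the original reply.

```r
library(purrr)
library(dplyr)

# Toy stand-in for zz (made-up values), plus the year list from the question.
zz_small <- list(
  tibble(year = c(2000, 2001, 2002, 2003), tot_i = c(1, 2, 3, 4)),
  tibble(year = c(2000, 2001, 2002, 2003), tot_i = c(5, 6, 7, 8))
)
ll <- list(c(2000, 2001), c(2001, 2003))

# map2() walks the two lists in parallel, so they should have the same length.
stopifnot(length(zz_small) == length(ll))

# For each pair (data frame, vector of years), keep only the matching rows.
years_small <- map2(zz_small, ll, function(df, yrs) filter(df, year %in% yrs))
```

The first element of years_small keeps the rows for 2000 and 2001, and the second keeps the rows for 2001 and 2003, because map2() pairs the i-th data frame with the i-th vector of years.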

Jim Holtman

Re: Purr and Basic Functional Programming Tasks

Lorenzo Isella
Dear Jim,
Thanks a lot for your stellar replies!
They address my questions perfectly.
Cheers

Lorenzo


On Fri, Jan 25, 2019 at 07:46:50AM -0800, jim holtman wrote:

>[...]