Average distance in kilometers between subsets of points with ggmap /geosphere

Malte Hückstädt
I would like to compute the pairwise geographical distances between a set of addresses and then take their mean (the mean distance).

For a data frame with only one row, I have found a solution:

```r
# Load packages
library(readxl)
library(openxlsx)
library(googleway)
#library(sf)
library(tidyverse)
library(geosphere)
library("ggmap")

# Set the API key
set_key("")
api_key <- ""
register_google(key=api_key)

#  Data
df <- data.frame(
  V1 = c("80538 München, Germany", "01328 Dresden, Germany", "80538 München, Germany",
         "07745 Jena, Germany",    "10117 Berlin, Germany"),
  V2 = c("82152 Planegg, Germany", "01069 Dresden, Germany", "82152 Planegg, Germany",
         "07743 Jena, Germany",    "14195 Berlin, Germany"),
  V3 = c("85748 Garching, Germany", "01069 Dresden, Germany",  "85748 Garching, Germany",
         NA,     "10318 Berlin, Germany"),
  V4 = c("80805 München, Germany", "01187 Dresden, Germany", "80805 München, Germany",
         "07745 Jena, Germany", NA), stringsAsFactors=FALSE
)

# Replace NA with "" so that geocode() accepts the input
df[is.na(df)] <- ""

# Keep only row 5
df1 <- slice(df, 5)

# Look up lon/lat for the addresses in that row
df_2 <- geocode(c(df1$V1, df1$V2, df1$V3, df1$V4)) %>% na.omit()

# Convert to a matrix
mat_df <- as.matrix(df_2)

# Pairwise distance matrix (in meters)
dist_mat <- distm(mat_df)

# Mean distance for row 5, in kilometers
mean(dist_mat[lower.tri(dist_mat)]) / 1000
```
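A small sketch of what the last line of that block does, for readers without an API key. The matrix below is a hypothetical, hand-written stand-in for the output of `distm()` (made-up values in meters, not geocoded results). `lower.tri()` selects each unordered pair of points exactly once and skips the zero diagonal, so the mean is not diluted by the self-distances.

```r
# Hypothetical symmetric distance matrix in meters (toy values, not real data).
dist_mat <- matrix(c(   0, 3000, 6000,
                     3000,    0, 9000,
                     6000, 9000,    0),
                   nrow = 3, byrow = TRUE)

# Elements below the diagonal: every pair of points once, no self-distances.
pair_d <- dist_mat[lower.tri(dist_mat)]

# Mean pairwise distance, converted from meters to kilometers.
mean_km <- mean(pair_d) / 1000
mean_km
```

Taking `mean(dist_mat)` over the whole matrix instead would include the three zeros on the diagonal and give a smaller value.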

Unfortunately, I have not managed to write a function that runs this code for an entire data set. My current problem is that the function does not calculate the mean distance row by row; instead it calculates a single mean over all rows of the data set.

```r
# Function

Mean_Dist <- function(df,w,x,y,z) {
 
  # for (row in 1:nrow(df)) {
  #   dist_mat <- geocode(c(w, x, y, z))
  #  
  # }
 
  df <- geocode(c(w, x, y, z)) %>% na.omit() # extract lon/lat from the addresses

  mat_df <- as.matrix(df) # write them into a matrix
 
  dist_mat <- distm(mat_df)
 
  dist_mean <- mean(dist_mat[lower.tri(dist_mat)])
 
  return(dist_mean)
}

df %>%  mutate(lon =  Mean_Dist(df,df$V1, df$V2,df$V3, df$V4)/1000)

```
Do you have any idea what mistake I made?

To clarify my question: I am trying to create a data frame like this one, with a new column V5:

```r
  V1                     V2                     V3                      V4                     V5
  <chr>                  <chr>                  <chr>                   <chr>                  <numeric>
1 80538 München, Germany 82152 Planegg, Germany 85748 Garching, Germany 80805 München, Germany Mean_Dist_row1
2 01328 Dresden, Germany 01069 Dresden, Germany 01069 Dresden, Germany  01187 Dresden, Germany Mean_Dist_row2
3 80538 München, Germany 82152 Planegg, Germany 85748 Garching, Germany 80805 München, Germany Mean_Dist_row3
4 07745 Jena, Germany    07743 Jena, Germany    07745 Jena, Germany     07745 Jena, Germany    Mean_Dist_row4
5 10117 Berlin, Germany  14195 Berlin, Germany  10318 Berlin, Germany   14476 Potsdam, Germany Mean_Dist_row5
```

i.e. the mean distance for each row.
______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: Average distance in kilometers between subsets of points with ggmap /geosphere

Eric Berger
Hi Malte,
I only skimmed your question and looked at the desired output.
I wondered if the apply function could meet your needs.
Here's a small example that might help you:

m <- matrix(1:9, nrow = 3)
m <- cbind(m, apply(m, MARGIN = 1, mean))  # MARGIN = 1 applies the function row-wise
m

#      [,1] [,2] [,3] [,4]
# [1,]    1    4    7    4
# [2,]    2    5    8    5
# [3,]    3    6    9    6
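
Applied to the original question, the same row-wise idea might look like the sketch below. The likely reason `Mean_Dist()` returns one number is that `c(df$V1, df$V2, df$V3, df$V4)` concatenates whole columns, so every address in the data set is geocoded and averaged together, and `mutate()` then recycles that single value into every row. Everything in the sketch is an assumption made for illustration: `coords` is a hypothetical lookup table of toy coordinates standing in for `geocode()` (so no API key is needed), `haversine_km()` is a plain base-R great-circle formula swapped in for `geosphere::distm()`, and `mean_dist_row()` and `df_toy` are made-up names.

```r
# Hypothetical lookup table standing in for ggmap::geocode():
# address label -> lon/lat in degrees (toy values, not real geocoding results).
coords <- data.frame(
  address = c("A", "B", "C", "D"),
  lon     = c(0, 0, 0, 1),
  lat     = c(0, 1, 2, 0),
  stringsAsFactors = FALSE
)

# Great-circle (haversine) distance in km between lon/lat points,
# assuming a spherical Earth of radius 6371 km. Vectorised over its inputs.
haversine_km <- function(lon1, lat1, lon2, lat2, r = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * r * asin(pmin(1, sqrt(a)))
}

# Mean pairwise distance for ONE row of addresses.
# NA or unknown addresses are dropped before the distances are computed.
mean_dist_row <- function(addresses) {
  pts <- coords[match(addresses, coords$address), c("lon", "lat")]
  pts <- pts[stats::complete.cases(pts), , drop = FALSE]
  if (nrow(pts) < 2) return(NA_real_)
  pairs <- utils::combn(nrow(pts), 2)  # each column is one pair of row indices
  d <- haversine_km(pts$lon[pairs[1, ]], pts$lat[pairs[1, ]],
                    pts$lon[pairs[2, ]], pts$lat[pairs[2, ]])
  mean(d)
}

# Toy data frame with the same shape as the one in the question.
df_toy <- data.frame(
  V1 = c("A", "A"),
  V2 = c("B", "D"),
  V3 = c("C", NA),
  stringsAsFactors = FALSE
)

# MARGIN = 1 hands each row to mean_dist_row(), giving one value per row.
df_toy$V5 <- apply(df_toy[, c("V1", "V2", "V3")], MARGIN = 1, mean_dist_row)
df_toy
```

With these toy points, row 1 comes out at about 148.3 km and row 2 at about 111.2 km, one value per row rather than one value for the whole data set. With real addresses, the lookup inside `mean_dist_row()` would be replaced by a call to `geocode()` on that row's addresses only.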

HTH,
Eric


On Mon, Sep 23, 2019 at 10:18 AM Malte Hückstädt <[hidden email]> wrote:

> [...]
Re: Average distance in kilometers between subsets of points with ggmap /geosphere

Eric Berger
You are welcome

On Tue, Sep 24, 2019 at 9:10 AM Malte Hückstädt <[hidden email]> wrote:

> Hello Eric, thanks a lot! In fact, your tip helped me a lot. I have now
> found a solution with lapply and apply. Thank you very much!
>
> regards, malte
>
>
> On 23.09.2019 at 09:32, Eric Berger <[hidden email]> wrote:
>
> [...]