Calculating just a single row of dissimilarity/distance matrix

Aerenbkts bkts
I have a data frame with 30k rows and 10 features. I would like to
calculate a distance matrix like this:

gower_dist <- daisy(data_frame, metric = "gower")


This function returns the whole dissimilarity matrix. I want just
the first row (that is, only the distances from the first element of
the data frame to all the others). How can I do that?


Regards

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: Calculating just a single row of dissimilarity/distance matrix

Eric Berger
first_row_dist <- as.numeric(gower_dist)[1:(attr(gower_dist, "Size") - 1)]

This gives you the distances from the first row to each of the subsequent rows.
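As a sanity check, here is a minimal sketch of why that indexing works, using iris as a stand-in for the real data (an assumption; the original data frame is not shown). A dissimilarity object stores the lower triangle column by column, so its first n - 1 entries belong to the first observation. Note that daisy() still computes the full object: for 30k rows that is 30000 * 29999 / 2, about 450 million distances, or roughly 3.6 GB as doubles.

```r
library(cluster)  # provides daisy()

# iris is only an illustrative stand-in for the poster's data frame.
gower_dist <- daisy(iris, metric = "gower")

n <- attr(gower_dist, "Size")

# The first n - 1 stored values are d(1, 2), d(1, 3), ..., d(1, n).
first_row_dist <- as.numeric(gower_dist)[seq_len(n - 1)]

# Cross-check against the first row of the full square matrix
# (dropping the zero self-distance in column 1).
full <- as.matrix(gower_dist)
stopifnot(isTRUE(all.equal(first_row_dist, unname(full[1, -1]))))
```

The cross-check is only practical on small data, since as.matrix() expands the object to a full n x n matrix.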

HTH,
Eric





Re: Calculating just a single row of dissimilarity/distance matrix

Jan van der Laan
In reply to this post by Aerenbkts bkts

Using another implementation of the gower distance:


library(gower)

gower_dist(iris[1,], iris)
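
For reference, a short sketch of what this call returns, again with iris standing in for the real data. The variable names d and nearest are illustrative only, and the nearest-rows step is a hypothetical follow-up, not something asked for in the thread.

```r
library(gower)

# Distances from the first row of iris to every row of iris (itself included).
# No full dissimilarity matrix is built; the result is a plain numeric vector.
d <- gower_dist(iris[1, ], iris)

length(d)  # one distance per row of iris
d[1]       # distance of row 1 to itself

# Hypothetical follow-up: indices of the five rows closest to row 1,
# excluding row 1 itself.
nearest <- setdiff(order(d), 1)[1:5]
```

Because only one vector of length nrow(data) is produced, this approach should scale to the 30k-row case far more comfortably than computing the whole matrix first.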


HTH,

Jan



Re: Calculating just a single row of dissimilarity/distance matrix

Jan van der Laan
Please respond to the list; there are more people answering there.


As explained in the documentation, gower_dist performs a pairwise
comparison of the two arguments, recycling the shorter one if needed, so
indeed gower_dist(iris[1:5, ], iris) doesn't do what you want.

Possible solutions are:

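# One single-row data frame per row of interest (here the first 5 rows)
tmp <- split(iris[1:5, ], seq_len(5))

# Distances from each of those rows to every row of iris (one column per row)
sapply(tmp, gower_dist, iris)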


and:


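library(dplyr)
library(tidyr)

# All (x, y) index pairs: x over the first 5 rows, y over every row
pairs <- expand.grid(x = 1:5, y = seq_len(nrow(iris)))
# Element-wise distances between the paired rows
pairs$dist <- gower_dist(iris[pairs$x, ], iris[pairs$y, ])
# Reshape to wide form: one row per x, one column per y
pairs %>% spread(y, dist)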

I don't know which one is faster, and there are probably various other
solutions too.
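
One such alternative, as a minimal base R sketch: call gower_dist once per row of interest and bind the results into a matrix. The names n_rows and dist_mat are illustrative, and iris again stands in for the real data; this has not been benchmarked against the other approaches.

```r
library(gower)

n_rows <- 5  # number of leading rows to compute distances for (illustrative)

# For each of the first n_rows rows, compute its distance to every row of iris.
# vapply returns one column per call, so transpose to get one row per query row.
dist_mat <- t(vapply(
  seq_len(n_rows),
  function(i) gower_dist(iris[i, ], iris),
  numeric(nrow(iris))
))

dim(dist_mat)  # n_rows rows, nrow(iris) columns
```

Each row of dist_mat then corresponds to one row of the full dissimilarity matrix, so dist_mat[1, ] matches the single-row call gower_dist(iris[1, ], iris).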

--
Jan





On 27-10-18 18:04, Aerenbkts bkts wrote:

> Dear Jan,
>
> Thanks for your help. It works for the first element, but I also
> tried to calculate the distances for the first N rows. For example:
>
> gower_dist(iris[1:5, ], iris)  # Gower distances for the first 5 rows
>
> This did not work. Do you have any suggestions?