Grouping on a Distance Matrix

Dario Strbenac-2
Hello,

I'm looking for a function that groups elements that lie below a certain distance threshold, based on a distance matrix. In other words, I'd like to group samples without using a standard clustering algorithm on the distance matrix. For example, let the distance matrix be:

      A     B     C     D
A     0  0.03  0.77  1.12  
B  0.03     0  1.59  1.11
C  0.77  1.59     0  0.09  
D  1.12  1.11  0.09     0

Two clusters would be found with a cutoff of 0.1: the first contains A and B, the second contains C and D. Is there an efficient function that does this? I can think of how to do this recursively, but am hoping it has already been considered.

--------------------------------------
Dario Strbenac
PhD Student
University of Sydney
Camperdown NSW 2050
Australia
______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: Grouping on a Distance Matrix

Bert Gunter
You need to re-think. What you said is nonsense. Use an appropriate
clustering algorithm.
(a can be near b, and b near c, while a is not near c, where "near" means
closer than the threshold)
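
As a rough illustration of the point about "near" not being transitive, here
is a minimal sketch in base R. It assumes that "group everything linked by a
distance at or below the cutoff" is what is wanted, which amounts to cutting a
single-linkage tree at that height. The helper name group_by_threshold and the
three-point "chain" matrix are made up for this example; only the 4 x 4 matrix
comes from the original post.

```r
# Distance matrix from the original post.
m <- matrix(c(0,    0.03, 0.77, 1.12,
              0.03, 0,    1.59, 1.11,
              0.77, 1.59, 0,    0.09,
              1.12, 1.11, 0.09, 0),
            nrow = 4, byrow = TRUE,
            dimnames = list(LETTERS[1:4], LETTERS[1:4]))

# Hypothetical helper (not an existing R function): single linkage joins two
# groups as soon as any one pair of members is within the cut height, so
# cutting the tree at the cutoff groups elements connected by such pairs.
group_by_threshold <- function(d, cutoff) {
  tree <- hclust(as.dist(d), method = "single")
  cutree(tree, h = cutoff)
}

groups <- group_by_threshold(m, 0.1)
print(split(names(groups), groups))  # members of each group

# Made-up chain: a-b and b-c are within the cutoff, a-c (0.16) is not,
# yet single linkage still places all three in one group.
chain <- matrix(c(0,    0.08, 0.16,
                  0.08, 0,    0.08,
                  0.16, 0.08, 0),
                nrow = 3, byrow = TRUE,
                dimnames = list(c("a", "b", "c"), c("a", "b", "c")))
chained <- group_by_threshold(chain, 0.1)
print(chained)
```

On the matrix from the post this puts A with B and C with D. On the chain it
puts a, b and c together even though a and c are further apart than the
cutoff, which is exactly the ambiguity described above. If every pair within a
group has to be within the cutoff, method = "complete" in the same helper is
one option to consider, since complete linkage merges on the largest pairwise
distance.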

Cheers,
Bert

Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

"Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom."
H. Gilbert Welch




On Thu, Feb 13, 2014 at 12:00 AM, Dario Strbenac
<[hidden email]> wrote:

> Hello,
>
> I'm looking for a function that groups elements that lie below a certain distance threshold, based on a distance matrix. In other words, I'd like to group samples without using a standard clustering algorithm on the distance matrix. For example, let the distance matrix be:
>
>       A     B     C     D
> A     0  0.03  0.77  1.12
> B  0.03     0  1.59  1.11
> C  0.77  1.59     0  0.09
> D  1.12  1.11  0.09     0
>
> Two clusters would be found with a cutoff of 0.1: the first contains A and B, the second contains C and D. Is there an efficient function that does this? I can think of how to do this recursively, but am hoping it has already been considered.
>
> --------------------------------------
> Dario Strbenac
> PhD Student
> University of Sydney
> Camperdown NSW 2050
> Australia
