Clustering and Rand Index - VS-KM

Mark Hempelmann
Dear WizaRds,

I have been trying to compute the adjusted Rand index of Hubert and
Arabie, but I could not work out how to define a partition object, as
described in my request yesterday.

With package fpc I try to work around the problem, using my original data:

mat <- matrix( c(6,7,8,2,3,4,12,14,14, 14,15,13,3,1,2,3,4,2,
15,3,10,5,11,7,13,6,1, 15,4,10,6,12,8,12,7,1), ncol=9, byrow=TRUE )
rownames(mat) <- paste("v", 1:4, sep="" )

## and the given partitions:

p1=c(1,1,1,2,2,2,3,3,3)
p2=c(1,1,1,3,2,2,3,3,2)
p3=c(1,2,1,3,1,3,1,3,2)
p4=c(1,2,1,3,1,3,1,3,2)

## Now

cluster.stats(d=dist(mat), clustering=p1, alt.clustering=p2)

## just gives
Error in as.dist(dmat[clustering == i, clustering == i]) :
        (subscript) logical subscript too long

I think I don't understand the use of 'd' here. How can I calculate the
corrected Rand matrix:
( .000  .407 -.071 -.071)
( .407  .000 -.071 -.071)
(-.071 -.071  .000 1.000)
(-.071 -.071 1.000  .000)

Does the clue package help me here? Does anyone know if there is a VS-KM
algorithm (Variable Selection Heuristic for K-Means Clustering)
implemented in R? Unfortunately, my searches did not turn up anything.

Thank you for your help and support
Mark

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: Clustering and Rand Index - VS-KM

Ales Ziberna-2
You can compute the adjusted Rand index with the function classAgreement
from package e1071:
classAgreement(table(p1,p2))$crand
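
To get all pairwise values at once, here is a minimal base-R sketch of the Hubert/Arabie adjusted Rand index, computed from the contingency table of two partitions. The helper name adj_rand and the list parts are made up for illustration and do not come from e1071 or fpc; the p1/p2 value should agree with the classAgreement() call above.

```r
# Partitions from the original post
p1 <- c(1,1,1,2,2,2,3,3,3)
p2 <- c(1,1,1,3,2,2,3,3,2)
p3 <- c(1,2,1,3,1,3,1,3,2)
p4 <- c(1,2,1,3,1,3,1,3,2)

# Hypothetical helper: adjusted Rand index of two label vectors
adj_rand <- function(a, b) {
  tab <- unclass(table(a, b))              # contingency table as a plain matrix
  n   <- sum(tab)
  sum_ij <- sum(choose(tab, 2))            # pairs placed together in both partitions
  sum_a  <- sum(choose(rowSums(tab), 2))   # pairs placed together in a
  sum_b  <- sum(choose(colSums(tab), 2))   # pairs placed together in b
  expected  <- sum_a * sum_b / choose(n, 2)
  max_index <- (sum_a + sum_b) / 2
  (sum_ij - expected) / (max_index - expected)
}

# Pairwise matrix over all four partitions
parts <- list(p1 = p1, p2 = p2, p3 = p3, p4 = p4)
ari <- sapply(parts, function(a) sapply(parts, function(b) adj_rand(a, b)))
round(ari, 3)
```

The off-diagonal entries reproduce the .407, -.071 and 1.000 from the question. The diagonal comes out as 1 here, since a partition agrees perfectly with itself, whereas the matrix in the question prints .000 there.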

You can also use
 cluster.stats(d=dist(t(mat)), clustering=p1, alt.clustering=p2)

However, in your original code the orientation of mat is wrong (that's why
there is a "t()" around mat in my code above). dist() computes distances
between the rows of its argument, so the cases should be in the rows and the
variables in the columns.
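
A quick way to see the mismatch, as a sketch using only base R and the data from the original post: the number of objects in the dist object has to equal the length of the clustering vector.

```r
# Data from the original post: 4 variables (rows) measured on 9 cases (columns)
mat <- matrix(c(6,7,8,2,3,4,12,14,14, 14,15,13,3,1,2,3,4,2,
                15,3,10,5,11,7,13,6,1, 15,4,10,6,12,8,12,7,1),
              ncol = 9, byrow = TRUE)
rownames(mat) <- paste("v", 1:4, sep = "")
p1 <- c(1,1,1,2,2,2,3,3,3)

dim(mat)                      # 4 rows, 9 columns
attr(dist(mat), "Size")       # 4: distances between the four variables
attr(dist(t(mat)), "Size")    # 9: distances between the nine cases
length(p1)                    # 9: one cluster label per case
```

Only dist(t(mat)) has one object per entry of p1, which is why the untransposed call fails with "logical subscript too long". With the transposed matrix, the adjusted Rand index should be available, as far as I know, in the corrected.rand element of the list returned by cluster.stats().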

Best,
Ales Ziberna



-----Original Message-----
From: [hidden email]
[mailto:[hidden email]] On Behalf Of Mark Hempelmann
Sent: Monday, January 09, 2006 12:43 AM
To: [hidden email]
Subject: [R] Clustering and Rand Index - VS-KM
