Cluster Analysis - Number of Clusters

Cluster Analysis - Number of Clusters

John Janmaat
Hello,

I'm playing around with cluster analysis, and am looking for methods to
select the number of clusters.  I am aware of methods based on a 'pseudo
F' or a 'pseudo T^2'.  Are there packages in R that will generate these
statistics, and/or other statistics to aid in cluster number selection?

Thanks,

John.
--
===========================================================================
Dr. John Janmaat                       Tel: 902-585-1461
Department of Economics                Fax: 902-585-1070
Acadia University                      Email: [hidden email]
Wolfville, Nova Scotia, Canada.        Web: ace.acadiau.ca/~jjanmaat/

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: Cluster Analysis - Number of Clusters

P. Olsson
Have you checked the amap package? It was updated just recently and, if
I am not mistaken, it has a method that suggests the best number of groups k
for your data.

Best wishes,
P. Olsson



2006/2/5, John Janmaat <[hidden email]>:

>
> Hello,
>
> I'm playing around with cluster analysis, and am looking for methods to
> select the number of clusters.  I am aware of methods based on a 'pseudo
> F' or a 'pseudo T^2'.  Are there packages in R that will generate these
> statistics, and/or other statistics to aid in cluster number selection?

Re: Cluster Analysis - Number of Clusters

Christian Hennig
In reply to this post by John Janmaat
Hi,

As said before, some statistics for estimating the number of clusters are
available in the cluster.stats function of package fpc. These are
distance-based, not "pseudo F" or "pseudo T^2". They are documented in
Gordon (1999), Classification (see ?cluster.stats for more references).
cluster.stats also reports the average silhouette width of Kaufman and
Rousseeuw (1990) (exact reference in ?plot.agnes), which is also part of the
output of some functions in package cluster (e.g. pam, clara, fanny).
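
As a rough sketch of the average silhouette width criterion: pam() in package cluster stores it in silinfo$avg.width, so one can fit a range of k and take the largest value. The simulated data, the seed and the range k = 2, ..., 6 are assumptions made only for illustration; they are not from this thread.

```r
library(cluster)

# Simulated data (an assumption for illustration): three well-separated
# Gaussian groups of 50 points each in two dimensions.
set.seed(1)
centers <- rbind(c(0, 0), c(10, 0), c(5, 10))
x <- do.call(rbind, lapply(1:3, function(j)
  cbind(rnorm(50, centers[j, 1]), rnorm(50, centers[j, 2]))))

# Average silhouette width for k = 2, ..., 6 (it is not defined for k = 1).
ks <- 2:6
asw <- sapply(ks, function(k) pam(x, k)$silinfo$avg.width)

# Larger is better: pick the k with the widest average silhouette.
best_k <- ks[which.max(asw)]
print(data.frame(k = ks, avg.sil.width = round(asw, 3)))
best_k
```

The same loop should work with other partitioning functions that return silinfo; for a hierarchical clustering one would instead pass the cut tree and the distances to silhouette().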

An alternative way to estimate the number of clusters is the use of the
BIC together with a (normal) mixture model, see package mclust.
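
A minimal sketch of the BIC approach, assuming package mclust is installed: Mclust() fits normal mixture models over a range of component numbers G and keeps the one with the best BIC. Note that mclust defines BIC so that larger values are better. The simulated data, the seed and the range G = 1:6 are illustrative assumptions.

```r
library(mclust)

# Simulated data (an assumption for illustration): three well-separated
# Gaussian groups of 50 points each in two dimensions.
set.seed(1)
centers <- rbind(c(0, 0), c(10, 0), c(5, 10))
x <- do.call(rbind, lapply(1:3, function(j)
  cbind(rnorm(50, centers[j, 1]), rnorm(50, centers[j, 2]))))

# Fit mixtures with 1 to 6 components; several covariance structures are
# tried for each G and compared by BIC.
fit <- Mclust(x, G = 1:6)

fit$G          # number of components selected by BIC
fit$modelName  # covariance structure of the selected model
summary(fit)
```

Because the number of clusters and the covariance structure are chosen jointly, it is worth checking fit$BIC to see how close the competing models are before settling on one value of G.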

Best,
Christian


*** --- ***
Christian Hennig
University College London, Department of Statistical Science
Gower St., London WC1E 6BT, phone +44 207 679 1698
[hidden email], www.homepages.ucl.ac.uk/~ucakche


Re: Cluster Analysis - Number of Clusters

TEMPL Matthias
In reply to this post by John Janmaat
Dear John,

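You can play around with the cluster.stats function in package fpc, e.g. you
can try:

library(fpc)
library(cluster)
data(xclara)
dM <- dist(xclara)
# wb.ratio = average within-cluster distance / average between-cluster
# distance, so smaller values mean more compact, better separated clusters
cl <- rep(NA_real_, 7)
for (i in 2:7) {
  cl[i] <- cluster.stats(d = dM, clustering = clara(xclara, i)$clustering,
                         silhouette = FALSE)$wb.ratio
}
plot(2:7, cl[2:7], type = "b", xlab = "number of clusters", ylab = "wb.ratio")

(this can take a few minutes)
The plot indicates that 3 clusters are "optimal" for these data.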
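
For the "pseudo F" asked about in the original question, one common definition is the Calinski-Harabasz index, [B/(k-1)] / [W/(n-k)], where B and W are the between- and within-cluster sums of squares. The following is only a sketch in base R using kmeans(); the simulated data, the seed, the helper name pseudo_f and the range k = 2, ..., 6 are assumptions for illustration.

```r
# Simulated data (an assumption for illustration): three well-separated
# Gaussian groups of 50 points each in two dimensions.
set.seed(1)
centers <- rbind(c(0, 0), c(10, 0), c(5, 10))
x <- do.call(rbind, lapply(1:3, function(j)
  cbind(rnorm(50, centers[j, 1]), rnorm(50, centers[j, 2]))))
n <- nrow(x)

# Pseudo F (Calinski-Harabasz) for a k-means partition with k clusters.
# pseudo_f is a hypothetical helper name, not a function from any package.
pseudo_f <- function(k) {
  km <- kmeans(x, centers = k, nstart = 20)
  (km$betweenss / (k - 1)) / (km$tot.withinss / (n - k))
}

ks <- 2:6
pf <- sapply(ks, pseudo_f)

# Larger is better: pick the k with the largest pseudo F.
best_k <- ks[which.max(pf)]
print(data.frame(k = ks, pseudo.F = round(pf, 1)))
best_k
```

The statistic is undefined for k = 1, and it assumes squared Euclidean distances, so it fits k-means or Ward-type clusterings more naturally than arbitrary dissimilarities.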

Best,
Matthias

