Clustering question \ dist(datmat)

6 messages

Clustering question \ dist(datmat)

Hello everybody. I am trying to cluster circular data (data points which are angles), so I cannot use the "dist" function in "mclust" to generate my distance matrix. Instead I am using the function Dij = 0.5*(1 - cos(theta_i - theta_j)). The problem is that "hclust" will not accept this distance matrix. I tried putting it in a data frame, but again I get the error message

    Error in if (n < 2) stop("must have n >= 2 objects to cluster") : argument is of length zero

The distance matrix that "dist" produces is a lower triangular one, whereas mine is a square matrix, which I think does not matter. My question is how to make "hclust" process my distance matrix, and what I am doing wrong. I am sure the problem is with the distance matrix format. Any suggestions are highly appreciated. The code below shows what I have done.

    library(CircStats)   # provides rvm() and circ.plot()

    # five groups of 5 von Mises draws each, with different mean directions
    clust1 <- as.vector(rvm(5, 5, 15))
    clust2 <- as.vector(rvm(5, 10, 15))
    clust3 <- as.vector(rvm(5, 15, 15))
    clust4 <- as.vector(rvm(5, 20, 15))
    clust5 <- as.vector(rvm(5, 25, 15))
    data1 <- rbind(clust1, clust2, clust3, clust4, clust5)
    datmat <- matrix(data1, nrow = 25, ncol = 1, byrow = TRUE)
    circ.plot(datmat)

    # square 25 x 25 matrix of pairwise circular distances
    df <- array(dim = c(25, 25))
    for (i in 1:25) {
      for (j in 1:25) {
        df[i, j] <- 0.5 * (1 - cos(datmat[i] - datmat[j]))
      }
    }

    # this is the call that fails
    hcA <- hclust(df, method = "average")

Ahmed
Florida

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: Clustering question \ dist(datmat)

A distance matrix must be of class "dist". Try

    hclust(as.dist(df))
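The suggestion above can be tried with a minimal sketch along the following lines. It uses base R only: the angles are simulated with rnorm() as a stand-in for the rvm() draws in the original post, and the seed and the names centres, theta, D, and groups are illustrative assumptions, not part of the thread.

```r
set.seed(1)

# Hypothetical stand-in data: 25 angles in radians, 5 around each of 5 centres.
centres <- c(0.5, 1.5, 2.5, 3.5, 4.5)
theta <- rep(centres, each = 5) + rnorm(25, sd = 0.1)

# Square matrix of D_ij = 0.5 * (1 - cos(theta_i - theta_j)), built without loops.
D <- 0.5 * (1 - cos(outer(theta, theta, "-")))

# hclust() expects an object of class "dist", so convert before clustering.
hcA <- hclust(as.dist(D), method = "average")

# Cut the tree into 5 groups and tabulate the cluster sizes.
groups <- cutree(hcA, k = 5)
print(table(groups))
```

Any other linkage accepted by hclust() can be swapped in through the method argument; the conversion step stays the same.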

Re: Clustering question \ dist(datmat)

Dear Gabor and all,

I know this will work, but I already have a distance matrix calculated using my distance measure Dij = 0.5*(1 - cos(theta_i - theta_j)). If I do hclust(as.dist(df)), then I am taking distances a second time on a matrix "df" that is already supposed to be a distance matrix. I hope I am clear.

PS: I just found out I can use kmeans(df, 3, iter.max=100); it will take df as calculated by Dij. I still need to use the methods in hclust, such as "single", "average", "ward", "median", "mcquitty", etc.

Thank you anyway.

Ahmed Albatineh, PhD
Assistant Professor of Statistics
Nova Southeastern University
Fort Lauderdale, FL 33314
U.S.A.
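A note on the kmeans() remark: kmeans() treats each row of its first argument as the coordinates of one observation, so a 25 x 25 matrix is clustered as 25 points in 25 dimensions and its entries are not used as dissimilarities. One possible alternative, which is an assumption of this note and was not proposed in the thread, is to place each angle on the unit circle as (cos theta, sin theta). The squared Euclidean distance between two such points is 2 - 2*cos(theta_i - theta_j), which is exactly 4*Dij, so k-means on these coordinates works with the same measure up to a constant factor. The simulated data and all variable names below are hypothetical.

```r
set.seed(1)

# Hypothetical stand-in data: 15 angles in radians around 3 centres.
theta <- rep(c(0.5, 2.5, 4.5), each = 5) + rnorm(15, sd = 0.1)

# Embed each angle as a point on the unit circle.
xy <- cbind(cos(theta), sin(theta))

# Check the identity: squared Euclidean distance equals 4 * D_ij.
D <- 0.5 * (1 - cos(outer(theta, theta, "-")))
sq_euclid <- as.matrix(dist(xy))^2
identity_holds <- isTRUE(all.equal(sq_euclid, 4 * D, check.attributes = FALSE))
print(identity_holds)

# k-means on the embedded coordinates rather than on the distance matrix.
km <- kmeans(xy, centers = 3, nstart = 10, iter.max = 100)
print(table(km$cluster))
```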

Re: Clustering question \ dist(datmat)

as.dist() does _not_ recompute the distances if given a matrix. It simply takes the lower triangular portion of the given distance matrix and attaches some attributes recording the original dimension. I don't think you need to object to that.

Andy
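The difference between as.dist() and dist() can be checked on a small case. The sketch below is illustrative only: the three angles and the variable names are made up, and the comparison uses base R functions.

```r
# Small hypothetical example: three angles in radians.
theta <- c(0.1, 1.0, 3.0)
D <- 0.5 * (1 - cos(outer(theta, theta, "-")))

d1 <- as.dist(D)   # reuses the entries of D (lower triangle only)
d2 <- dist(D)      # treats rows of D as points and computes new Euclidean distances

# Converting d1 back to a square matrix recovers the original values.
same_values <- isTRUE(all.equal(as.matrix(d1), D, check.attributes = FALSE))
print(same_values)

# dist(D), by contrast, yields a different set of numbers.
recomputed_differs <- !isTRUE(all.equal(as.vector(d1), as.vector(d2)))
print(recomputed_differs)
```

So hclust(as.dist(df)) clusters on the Dij values as computed, whereas hclust(dist(df)) would be the call that takes distances a second time.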