

Hi all, I am sorry if this is a very basic quesion, but I have no experience with analyzing spatial data and could not find the right function/package quickly. Any hints would be much appreciated. I have a matrix of spatial point patterns like the one below and want to find the number of independent components (if that's the right term) in that matrix (or in that image).
x=matrix(c(0,1,0,0,0,
0,1,1,0,0,
0,0,0,0,0,
0,0,0,1,0,
0,0,0,1,0),nrow=5)
image(x)
I can find the number of populated points easily
table(x) #or more generally
sum(x!=0)
But I want to find the number of independent components. The answer in this example should be 2. There are three criteria to the function I am seeking:
1. Points that have a neighboring nonzero point should be counted as one contiguous component.
2. The function should respect that the matrix is projected on a torso. That is, points in the leftmost column border points in the rightmost column and points in the top row border points in the bottom row (if they are contiguous when you wrap the image around a cylinder).
3. The function should be fast/efficient since I need to run this over thousands of images/matrices.
Is anyone aware of an implementation of such a function?
Thanks much for your help,
Daniel


I am sure this can be written in a much more elegant and faster code.
One way I can think of, is to modify cell2nb code and create a new
function that creates the neighbour lists of only cells that are not
0. These are best directed to RsigGeo list.
However, the following might work.
library(spdep)
library(igraph)
> x=matrix(c(0,1,0,0,0,
> 0,1,1,0,0,
> 0,0,0,0,0,
> 0,0,0,1,0,
> 0,0,0,1,0),nrow=5)
a < nb2mat(cell2nb(nrow(x),ncol(x)), style="B", torus="TRUE")
g < delete.vertices(graph.adjacency(a), which(x!=1)1)
plot(g)
clusters(g)
Nikhil

Nikhil Kaza
Asst. Professor,
City and Regional Planning
University of North Carolina
[hidden email]
On Jun 20, 2010, at 7:17 PM, Daniel Malter wrote:
>
> Hi all, I am sorry if this is a very basic quesion, but I have no
> experience
> with analyzing spatial data and could not find the right function/
> package
> quickly. Any hints would be much appreciated. I have a matrix of
> spatial
> point patterns like the one below and want to find the number of
> independent
> components (if that's the right term) in that matrix (or in that
> image).
>
> x=matrix(c(0,1,0,0,0,
> 0,1,1,0,0,
> 0,0,0,0,0,
> 0,0,0,1,0,
> 0,0,0,1,0),nrow=5)
>
> image(x)
>
> I can find the number of populated points easily
>
> table(x) #or more generally
> sum(x!=0)
>
> But I want to find the number of independent components. The answer
> in this
> example should be 2. There are three criteria to the function I am
> seeking:
>
> 1. Points that have a neighboring nonzero point should be counted as
> one
> contiguous component.
>
> 2. The function should respect that the matrix is projected on a
> torso. That
> is, points in the leftmost column border points in the rightmost
> column and
> points in the top row border points in the bottom row (if they are
> contiguous when you wrap the image around a cylinder).
>
> 3. The function should be fast/efficient since I need to run this over
> thousands of images/matrices.
>
> Is anyone aware of an implementation of such a function?
>
> Thanks much for your help,
> Daniel
> 
> View this message in context: http://r.789695.n4.nabble.com/Spatialnumberofindependentcomponentstp2262018p2262018.html> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/rhelp> PLEASE do read the posting guide http://www.Rproject.org/postingguide.html> and provide commented, minimal, selfcontained, reproducible code.
______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/rhelpPLEASE do read the posting guide http://www.Rproject.org/postingguide.htmland provide commented, minimal, selfcontained, reproducible code.


Hi, thanks much. This works in principle. The corrected code is below:
a < nb2mat(cell2nb(nrow(x),ncol(x),torus=T), style="B")
g < delete.vertices(graph.adjacency(a), which(x!=1)1)
plot(g)
clusters(g)
the $no argument of the clusters(g) function is the sought after number. However, the function is very slow, and my machine runs out of memory (1G) for a 101 by 101 matrix.
Daniel


Instead of nb2mat try
as.spam.listw(nb2listw(cell2nb(...)))
this will coerce the adjacency matrix into a sparse matrix representation
saving lot of memory.
Nikhil
On Sun, Jun 20, 2010 at 10:27 PM, Daniel Malter < [hidden email]> wrote:
>
> Hi, thanks much. This works in principle. The corrected code is below:
>
> a < nb2mat(cell2nb(nrow(x),ncol(x),torus=T), style="B")
> g < delete.vertices(graph.adjacency(a), which(x!=1)1)
>
> plot(g)
> clusters(g)
>
> the $no argument of the clusters(g) function is the sought after number.
> However, the function is very slow, and my machine runs out of memory (1G)
> for a 101 by 101 matrix.
>
> Daniel
> 
> View this message in context:
> http://r.789695.n4.nabble.com/Spatialnumberofindependentcomponentstp2262018p2262090.html> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/rhelp> PLEASE do read the posting guide
> http://www.Rproject.org/postingguide.html> and provide commented, minimal, selfcontained, reproducible code.
>
[[alternative HTML version deleted]]
______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/rhelpPLEASE do read the posting guide http://www.Rproject.org/postingguide.htmland provide commented, minimal, selfcontained, reproducible code.


as.spam.listw is an unknown function. Is it in a different package?
Daniel
other attached packages:
[1] spdep_0.511 coda_0.135 deldir_0.012 maptools_0.734 foreign_0.838 nlme_3.196 MASS_7.33
[8] Matrix_0.99937531 lattice_0.1726 boot_1.241 sp_0.964 igraph_0.5.3 RandomFields_1.3.41 svSocket_0.948
[15] TinnR_1.0.3 R2HTML_1.591 Hmisc_3.70 survival_2.357
loaded via a namespace (and not attached):
[1] cluster_1.12.1 grid_2.10.0 svMisc_0.956 tools_2.10.0


I just updated spdep and I see that as.spam.listw works. Below is
sessionInfo
Furthermore, it may be straightforward to condense the adjacency
matrix *before* converting to graph which may help a little bit. You
can profile the code and see which part needs speeding up.
library(spdep)
library(igraph)
x=matrix(c(0,1,0,0,0,
0,1,1,0,0,
0,0,0,0,0,
0,0,0,1,0,
0,0,0,1,0),nrow=5)
a < as.spam.listw(nb2listw(cell2nb(nrow(x),ncol(x),torus=T),
style="B"))
ind < which(x>0)
b < a[ind, ind]
g1 < graph.adjacency(b)
clusters(g1)$no

R version 2.11.1 (20100531)
i386appledarwin9.8.0
locale:
[1] en_US.UTF8/en_US.UTF8/C/C/en_US.UTF8/en_US.UTF8
attached base packages:
[1] stats graphics grDevices utils
[5] datasets methods base
other attached packages:
[1] igraph_0.5.3 spam_0.220
[3] spdep_0.511 coda_0.135
[5] deldir_0.012 maptools_0.734
[7] foreign_0.840 nlme_3.196
[9] MASS_7.36 Matrix_0.99937539
[11] lattice_0.188 boot_1.242
[13] sp_0.964
loaded via a namespace (and not attached):
[1] grid_2.11.1 tools_2.11.1
Nikhil Kaza
Asst. Professor,
City and Regional Planning
University of North Carolina
[hidden email]
______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/rhelpPLEASE do read the posting guide http://www.Rproject.org/postingguide.htmland provide commented, minimal, selfcontained, reproducible code.


I was missing the spam library.
I did some testing with m x m matrices (see below). Computing 'a' is the villain. The computation time for 'a' is exponential in m. For a 100 by 100 matrix, the predicted time is about 20 seconds. Thus, 100,000 runs, would take about 23 days.
library(igraph)
library(spdep)
library(spam)
d=5:40
tim1=NULL
tim2=NULL
tim3=NULL
for(i in 1:length(d)){
x=rbinom(d[i]^2,1,0.5)
dim(x)=c(d[i],d[i])
tim1[i]=system.time(
a < as.spam.listw(nb2listw(cell2nb(nrow(x),ncol(x),torus=T), style="B"))
)[3]
tim2[i]=system.time(
g < delete.vertices(graph.adjacency(a), which(x!=1)1)
)[3]
tim3[i]=system.time(
clusters(g)
)[3]
}
reg=lm(log(tim1)~log(d))
par(mfcol=c(1,2))
plot(log(tim1)~log(d))
lines(x=log(d),y=coef(reg)[1]+coef(reg)[2]*log(d),type="l")
plot(tim1~d)
lines(x=d,y=exp(coef(reg)[1]+coef(reg)[2]*log(d)),type="l")
#estimated time for a 100 by 100 matrix
exp(predict(reg,newdata=data.frame(d=100)))
#observed time for a 100 by 100 matrix
d=100
x=rbinom(d^2,1,0.5)
dim(x)=c(d,d)
system.time(a < as.spam.listw(nb2listw(cell2nb(nrow(x),ncol(x),torus=T), style="B")))
Best,
Daniel

