unable to run spatial lag and error models on large data

unable to run spatial lag and error models on large data

shishm
Banned User
Hi:
First, my apologies for cross-posting. A few days ago I posted my queries at R-sig-geo but did not get any response, hence this post.

I am working on two parcel-level housing datasets to estimate the impact of various variables on home sale prices.

I created the spatial weights matrices in ArcGIS 10, using the sale year of the four nearest houses to assign weights. Next, I ran LM tests and then ran the spatial lag and error models using the spdep package.

I ran into five issues.


Issue 1: When I build the weights for the first dataset (approx. 10,000 observations), I get the following message: Non-symmetric neighbours list.

Is this going to pose problems when running the regression models? If yes, what can I do?


The code and the results are:
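# Read the weights table exported from ArcGIS
test1.csv <- read.csv("C:/Article/Housing1/NHspwt.csv")

# Treat the data frame as a spatial.neighbour object so that sn2listw() accepts it
class(test1.csv) <- c("spatial.neighbour", class(test1.csv))
# Recode origin and neighbour IDs as consecutive integers; keep the original IDs as region.id
of <- ordered(test1.csv$OID)
attr(test1.csv, "region.id") <- levels(of)
test1.csv$OID <- as.integer(of)
test1.csv$NID <- as.integer(ordered(test1.csv$NID))
attr(test1.csv, "n") <- length(unique(test1.csv$OID))

# Convert to a listw weights object and label it as row-standardised ("W")
lw_test1.csv <- sn2listw(test1.csv)
lw_test1.csv$style <- "W"
lw_test1.csv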

Characteristics of weights list object:
Neighbour list object:
Number of regions: 10740
Number of nonzero links: 42960
Percentage nonzero weights: 0.03724395
Average number of links: 4
Non-symmetric neighbours list

Weights style: W
Weights constants summary:
      n        nn    S0       S1       S2
W 10740 115347600 10740 3129.831 44853.33


Issue 2: The spatial lag and error models do not run. I get the following message (the models run on half the data, approx. 5,000 observations; however, I would like to use the entire sample).

Error: cannot allocate vector of size 880.0 Mb
In addition: Warning messages:
1: In t.default(object) :
  Reached total allocation of 3004Mb: see help(memory.size)
2: In t.default(object) :
  Reached total allocation of 3004Mb: see help(memory.size)
3: In t.default(object) :
  Reached total allocation of 3004Mb: see help(memory.size)
4: In t.default(object) :
  Reached total allocation of 3004Mb: see help(memory.size)

The code for the lag model is:
fmtypecurrentcombinedlag <- lagsarlm(fmtypecurrentcombined, data = spssnew,
    lw_test1.csv, na.action = na.fail, type = "lag", method = "eigen",
    quiet = TRUE, zero.policy = TRUE, interval = NULL, tol.solve = 1.0e-20)

I am able to read the data file using the filehash package. However, I still get the following error message when I run the models:
Error in matrix(0, nrow = n, ncol = n) : too many elements specified


Issue 3: For the second dataset, which contains approx. 100,000 observations, I get the following error message when I try to run spatial lag or error models:
Error in matrix(0, nrow = n, ncol = n) : too many elements specified

The code is:
fecurrentcombinedlag <- lagsarlm(fecurrentcombined, data = spssall,
    lw_test2.csv, na.action = na.fail, type = "lag", method = "eigen",
    quiet = NULL, zero.policy = TRUE, interval = NULL, tol.solve = 1.0e-20)


Issue 5: When I run LM tests I get the test results, but with the following message: Spatial weights matrix not row standardized. Should I be worried about this, considering that I am using the 4-nearest-neighbor rule?

The code is:
lm.LMtests(fmtypecurrent, lw_test1.csv, test=c("LMerr", "LMlag", "RLMerr",
"RLMlag", "SARMA"))

Thanks
Shishm
        [[alternative HTML version deleted]]


______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: unable to run spatial lag and error models on large data

Roger Bivand
shish matt <shishm <at> yahoo.com> writes:

>
> Hi:
> First my apologies for cross-posting. A few days back I posted my queries at
> R-sig-geo but did not get any response. Hence this post.

Since your message never reached that list - did you check? - that isn't very
surprising. Try again properly on R-sig-geo.

>
> I am working on two parcel-level housing datasets to estimate the impact of
> various variables on home sale prices.
>
> I created the spatial weights matrices in ArcGIS 10 using sale
> year of four nearest houses to assign weights.

Create them in R, much less error prone. knn2nb(knearneigh()). A lot of problems
arise from badly imported weights, but yours below seem OK.
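
A minimal sketch of the suggestion above, assuming the spdep package is installed. The random `coords` matrix is hypothetical and only stands in for the parcel coordinates, which are not shown in the thread:

```r
library(spdep)

set.seed(1)
# Hypothetical planar coordinates standing in for the parcel locations
coords <- cbind(x = runif(200), y = runif(200))

# Four nearest neighbours of each point, converted to a neighbours list
nb4 <- knn2nb(knearneigh(coords, k = 4))

# Row-standardised weights (style "W"): the weights of each region sum to 1
lw4 <- nb2listw(nb4, style = "W")

# Check symmetry explicitly; k-nearest-neighbour lists are usually asymmetric
is.symmetric.nb(nb4, verbose = FALSE, force = TRUE)
```

Building the weights this way keeps the neighbour IDs and the data rows in one session, which may avoid the ID recoding needed after an import.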

> Next, I ran LM tests and
>  then ran the spatial lag and error models using spdep package.
>
> I run into five issues.
>
> Issue 1: When I weight the 10,000-observation first dataset, I get the
> following message: Non-symmetric neighbors list.
>
> Is this going to pose problems while running the regression models? If yes,
> what can I do?

What do you think? If you are using nearest neighbours, only a very unusual set
of points would give symmetric neighbours, and that would likely also have
subgraph problems.
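
To illustrate why nearest-neighbour relations are rarely symmetric, here is a small base R sketch with three made-up points on a line and a single nearest neighbour each (the coordinates are purely illustrative, not taken from the data in the thread):

```r
# Three hypothetical points on a line; k = 1 nearest neighbour
x <- c(0, 1, 3)
d <- as.matrix(dist(x))
diag(d) <- Inf                      # a point is not its own neighbour

# Index of the nearest neighbour of each point
nn <- apply(d, 1, which.min)

# Binary neighbour matrix: A[i, j] = 1 if j is the nearest neighbour of i
A <- matrix(0, 3, 3)
A[cbind(1:3, nn)] <- 1

# Point 3 has point 2 as its neighbour, but point 2's neighbour is point 1
isSymmetric(A)
```

The same thing happens with four neighbours: a house on the edge of a cluster counts its nearest houses as neighbours, while those houses may have closer neighbours of their own.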


>
> Issue 2: The spatial lag and error models do not run. I get
> the following message (the models run on half the data, approx. 5,000
>  observations.  However, I would like to use the entire sample).
>

Read the help pages, method= argument. For larger data sets, use "LU" or perhaps
"MC" when the weights are not symmetric.

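A back-of-the-envelope check in base R may help show why the method matters. It assumes, as the `matrix(0, nrow = n, ncol = n)` error in the original post suggests, that `method="eigen"` works with a dense n x n matrix of doubles; `dense_mb` is a helper name made up for this sketch:

```r
# Size in Mb of a dense n x n matrix of doubles (8 bytes per element)
dense_mb <- function(n) n^2 * 8 / 1024^2

n1 <- 10740          # number of regions reported for the first weights object
dense_mb(n1)         # about 880 Mb for a single dense copy

n2 <- 1e5            # approximate size of the second data set
# More elements than a 32-bit integer index can address
n2^2 > .Machine$integer.max

# With four neighbours per region, only 4 * n weights are non-zero
4 * n1
```

One dense copy for the first data set is roughly the 880.0 Mb named in the allocation error, and the second data set would need about 10^10 elements. Following the advice above, the change to try in the original call is replacing `method="eigen"` with `method="LU"` (or `"MC"`), which the reply recommends for larger data sets.
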

> Error: cannot allocate vector of size 880.0 Mb
> [...]
> tol.solve=1.0e-20)
>
> I am able to read the data file using filehash package.
>  However, I still get the following error message when I run the models:
>  Error in matrix(0, nrow = n, ncol = n) : too many elements specified

No idea, almost certainly caused by not reading the documentation. Try on the
original list.

> [[alternative HTML version deleted]]
>
>

DO follow the rule of not posting HTML!

Roger Bivand
Department of Economics
NHH Norwegian School of Economics
Helleveien 30
N-5045 Bergen, Norway