Hey,
I am a PhD student in forestry science, and I am brand new to R. I am working with lidar data (point clouds with X, Y and Z values). I want to create a spatial map from the point cloud by kriging. My problem is the size of the data set (over 5,000,000 points): I always run out of memory.

Is there a script to create a subset, or a way to change the radius of the variogram?

Thanks,
Ale

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
On Wednesday 16 July 2008, Alessandro wrote:
> Hey,
>
> I am a PhD student in forestry science, and I am brand new to R. I am
> working with lidar data (point clouds with X, Y and Z values). I want to
> create a spatial map from the point cloud by kriging. My problem is the
> size of the data set (over 5,000,000 points): I always run out of memory.
>
> Is there a script to create a subset, or a way to change the radius of
> the variogram?

Do you have any reason to prefer kriging over some other, less intensive method such as RST (regularized splines with tension)?

Check out GRASS or GMT for ideas on how to grid such a massive point set, specifically the r.in.xyz and v.surf.rst modules from GRASS.

Cheers,

--
Dylan Beaudette
Soil Resource Laboratory
http://casoilresource.lawr.ucdavis.edu/
University of California at Davis
530.754.7341
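As a rough illustration of what gridding a point set means here, the sketch below bins points into square cells and keeps the mean elevation per cell, using base R only. It is not r.in.xyz or v.surf.rst, just a small analogue of the idea. The fake data, the name `pts` and the 10-unit cell size are assumptions made for the example.

```r
# Hypothetical stand-in for a lidar point cloud (x, y = position, z = elevation).
set.seed(42)
pts <- data.frame(x = runif(10000, 0, 100),
                  y = runif(10000, 0, 100),
                  z = rnorm(10000, mean = 500, sd = 5))

cell <- 10  # assumed cell size, in the same units as x and y

# Index of the grid cell each point falls into.
pts$col <- floor(pts$x / cell)
pts$row <- floor(pts$y / cell)

# One mean elevation per occupied cell.
grid <- aggregate(z ~ col + row, data = pts, FUN = mean)

# Cell-centre coordinates, so the result can be used like a small point set.
grid$x <- (grid$col + 0.5) * cell
grid$y <- (grid$row + 0.5) * cell

nrow(grid)  # number of occupied cells, far fewer rows than the input
```

The gridded table has one row per occupied cell, so its size depends on the extent and the cell size rather than on the number of raw points. For millions of points the binning itself is still best done outside R, as suggested above.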
On Wednesday 16 July 2008, Alessandro wrote:
> Hey Dylan,
>
> Thank you. For my PhD I want to compare TIN (done, with ArcMap), IDW (done,
> with ArcMap) and a kriging model (and others, if possible) for creating the
> DSM, the DEM and the DCM (DSM - DEM). I tried gstat in IDRISI, but my PC
> runs out of memory.
>
> I would like to use gstat in R to build the surface map (in grid format for
> IDRISI or ArcMap). Unfortunately I have the same problem in R (out of
> memory), because the data set is big. So I want to create a random
> subsample of the more than 5,000,000 points.
>
> Here is my code (sorry, I am brand new to R).
>
> Data format (*.txt):
>
> X       Y       Z
> ....... ....... ........
> ....... ....... ........
>
> testground <- read.table(file="c:/work_LIDAR_USA/R_kriging/ground26841492694149.txt", header=T, sep=" ")
> summary(testground)
> plot(testground[,1], testground[,2])
> library(sp)
> class(testground)
> coordinates(testground)=~X+Y
> library(gstat)
> class(testground)
> V <- variogram(z~1, testground)
>
> At this step "out of memory" appears.
>
> I would be very grateful for your help, because my work is stuck.
>
> Ale

Hi Ale. Please remember to CC the list next time.

Since R is memory-bound (for the most part), you should summarize your data first, then load it into R. If you can install GRASS, I would highly recommend using the r.in.xyz command to pre-grid your data to a reasonable cell size, such that the resulting raster will fit into memory.

If you cannot, and can somehow manage to get the raw data into R, sampling random rows would work.

# make some data:
x <- 1:100000
# just some of the data
sample(x, 100)

# use this idea to extract x,y,z triplets
# from some fake data:
d <- data.frame(x=rnorm(100), y=rnorm(100), z=rnorm(100))
# select 10 random rows:
rand_rows <- sample(1:nrow(d), 10)
# just the selected rows:
d.small <- d[rand_rows, ]

Keep in mind you will need enough memory to contain the original data AND your subset data. Trash the original data with rm() once you have the subset data.

As for the statistical implications of randomly sampling a point cloud for variogram analysis, someone smarter than I may be helpful.

Cheers,

Dylan
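The sampling-then-rm() advice can be put together in one short sketch. The table size, the subset size of 5000 and the column names are assumptions for illustration. On the real data `d` would be the table read from the lidar file.

```r
set.seed(1)          # only to make the sketch reproducible
n_total <- 100000    # hypothetical size; the real file has millions of rows
d <- data.frame(X = runif(n_total), Y = runif(n_total), Z = rnorm(n_total))

n_keep <- 5000                          # assumed subset size, pick what fits in memory
rand_rows <- sample(nrow(d), n_keep)    # row indices, drawn without replacement
d.small <- d[rand_rows, ]               # just the selected rows

rm(d)             # drop the full table once the subset exists
invisible(gc())   # ask R to release the freed memory

dim(d.small)
```

Both tables exist at the same time until rm() is called, so the peak memory use is the full table plus the subset.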
Ciao Dylan,
THANKS for your help. When I reach the step "V <- variogram(z~1, d.small)", this message appears:

Error in gstat(formula = object, locations = locations, data = data) :
  argument "data" is missing, with no default

Here is my code. I hope to improve this code in R, because I believe R is a good solution for this new kind of data (lidar). For ecological, hydrological and other applications it is important to study several processing approaches and to test a range of software and procedures.

Thank you again, your help is very important to me.

Ale

> testground <- read.table(file="c:/work_LIDAR_USA/R_kriging/ground26841492694149.txt", header=T, sep=" ")
> library(sp)
> class(testground)
[1] "data.frame"
> coordinates(testground)=~X+Y
> library(gstat)
> class(testground)
[1] "SpatialPointsDataFrame"
attr(,"package")
[1] "sp"
> x <- 1:100000
> sample(x, 100)
  [1] 38465 18997 98968 56905 31535  5297 91034 57374 56148  4407 16033 74842
 [13] 49516 91422 31812 94924 44332 30412 21990 61698 53816 51227 24848 26824
 [25] 95203 20714 28172 60565 61309 24883 14063 19545 45505 24654 99649 92476
 [37] 84208 73181 13319  1559 67268 13935 57486  4162 49480 68167 38897 33295
 [49] 83067 47544 73390  9646 73967 81101 97055 96514 28011 99185 95511 98106
 [61] 86564  9635 58078 72627  2634 77933 80923 19056 13540 30066 66614 35185
 [73] 28856 61629 90387 30456 78108 18232 64321 68473  9021 15150 74326 17764
 [85] 98459 38203 62364 86437 65911 14058 27638 86792 82157 13721 15988 62189
 [97] 47190   912 33741 95151
> d <- data.frame(x=rnorm(100), y=rnorm(100), z=rnorm(100))
> rand_rows <- sample(1:nrow(d), 10)
> d.small <- d[rand_rows, ]
> summary(d.small)
       x                 y                 z
 Min.   :-1.9838   Min.   :-1.7096   Min.   :-1.8724
 1st Qu.:-0.5412   1st Qu.:-0.3629   1st Qu.:-1.3087
 Median : 0.1373   Median : 0.3014   Median :-0.6858
 Mean   :-0.1825   Mean   : 0.0811   Mean   :-0.5395
 3rd Qu.: 0.5796   3rd Qu.: 0.8645   3rd Qu.: 0.1156
 Max.   : 1.1075   Max.   : 0.9342   Max.   : 1.4642
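A likely reading of the error above is that d.small is still a plain data frame: coordinates() was applied to testground, not to the subset, so variogram() takes the second argument as the locations and finds no data. The sketch below shows two ways around that, plus the cutoff and width arguments that limit the distance range of the variogram. It assumes the sp and gstat packages are installed and uses fake data with lowercase x, y, z columns as in the example session. The cutoff and width values are arbitrary.

```r
library(sp)
library(gstat)

set.seed(1)  # fake stand-in for the sampled subset
d.small <- data.frame(x = runif(200), y = runif(200), z = rnorm(200))

# Option 1: promote a copy of the data frame to a SpatialPointsDataFrame first.
d.sp <- d.small
coordinates(d.sp) <- ~x + y
v1 <- variogram(z ~ 1, d.sp)

# Option 2: keep the plain data frame and pass the coordinates as a formula.
v2 <- variogram(z ~ 1, locations = ~x + y, data = d.small)

# cutoff limits the maximum pair distance, width sets the lag bin size
# (arbitrary values, chosen for the unit-square fake data).
v3 <- variogram(z ~ 1, d.sp, cutoff = 0.3, width = 0.03)
```

With the real data the column names in the formulas must match the file header. For the kriging step itself, krige() accepts nmax and maxdist to restrict each prediction to a local neighbourhood, which may also help with memory.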