Dear R Community,

I am a new user of R, and I am using R with GRASS GIS. I would like to apply svm to raster data in GRASS. I have one raster with the training areas and three other rasters, each representing a band of an ASTER satellite image. My goal is to classify the three rasters according to the training areas. Trying to replicate the guides I found on the net, I did the following:

    # load rasters
    Training <- readRAST6("Training")
    AST_L1B_1 <- readRAST6("AST_L1B_1")
    AST_L1B_2 <- readRAST6("AST_L1B_2")
    AST_L1B_3N <- readRAST6("AST_L1B_3N")

    # and then
    model_ASTER <- svm(Training_2006, AST_L1B_1, AST_L1B_2, AST_L1B_3N, type='C', kernel='linear')

but I get:

    Error in data.frame(y, x) :
      arguments imply differing number of rows: 1857076, 1488

Thanks for any help
Gab,
Make sure you have the predictor variables for each training observation:

    training <- data.frame(Training_2006, AST_L1B_1, AST_L1B_2, AST_L1B_3N)

If you can't do that, then you don't have as many training observations as you have predictor observations. Make sure to create one row for each set of predictor pixels corresponding to a training pixel. That should then work in svm(). Once you have a satisfactory model, take the rest of your pixels (where you have no training data) and make a prediction using the model.

Hope this helps,
Etienne

2012/2/14 gab <[hidden email]>
> Error in data.frame(y, x) :
> arguments imply differing number of rows: 1857076, 1488
>
> --
> View this message in context:
> http://r.789695.n4.nabble.com/svm-with-GRASS-GIS-tp4388006p4388006.html
> Sent from the R help mailing list archive at Nabble.com.

[[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
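A minimal sketch of the "one row per training pixel" idea from the reply above, using made-up values instead of the GRASS rasters. The names cells, class_id, band1, band2, band3 and to_classify are hypothetical, and the sketch assumes that every vector holds one value per raster cell in the same cell order, with NA where there is no training class:

```r
# Hypothetical stand-ins for the values read from GRASS: each vector holds one
# entry per raster cell, in the same cell order.
class_id <- c(NA, 7L, 7L, NA, NA, NA, 2L, 2L, NA, NA, NA, NA)  # NA = no training
band1 <- c(111, 110, 109, 95, 96, 97, 60, 61, 62, 80, 81, 82)
band2 <- c(66, 64, 65, 50, 51, 52, 40, 41, 42, 45, 46, 47)
band3 <- c(43, 42, 43, 35, 36, 37, 30, 31, 32, 33, 34, 35)

# data.frame() succeeds here only because all four vectors have the same length.
cells <- data.frame(class_id, band1, band2, band3)

# Training set: only the cells that carry a class label, with the label
# converted to a factor so that it is treated as a class, not a number.
training <- cells[!is.na(cells$class_id), ]
training$class_id <- factor(training$class_id)

# Remaining cells: band values only, to be classified once a model is fitted.
to_classify <- cells[is.na(cells$class_id), c("band1", "band2", "band3")]
```

With the data arranged this way, the training table has as many rows as there are labelled pixels, which is the condition the data.frame() error was complaining about.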
Hi Etienne, thank you.

Today I tried to understand a bit more. Here's what I did (the file names are a bit different):

    training <- data.frame(cbind(TL_training_2006_id,
                                 AST_L1B_2008_05_2009_area_giusta_1,
                                 AST_L1B_2008_05_2009_area_giusta_2,
                                 AST_L1B_2008_05_2009_area_giusta_3N))

Then:

    x <- subset(training, select = TL_training_2006_id)
    y <- subset(training, select = -TL_training_2006_id)

and finally:

    model_ASTER <- svm(x, y)

but I get :/

    Error in scale(newdata[, object$scaled, drop = FALSE], center = object$x.scale$"scaled:center", :
      (subscript) logical subscript too long

Thanks for any help
2012/2/15 gab <[hidden email]>
> Error in scale(newdata[, object$scaled, drop = FALSE], center =
> object$x.scale$"scaled:center", :
> (subscript) logical subscript too long

I'm pretty sure the problem is with your data frame. If you share the output of

    dput(training[1:10, ])  # make sure to include enough relevant rows

we could see the structure of your data.

--
Etienne
Dear Etienne, I'm a colleague of Gabriele and I'm more into R (while he is more into GRASS).

I'll try to explain what we have done so far.

1) Our ASTER images (B1, B2 and B3) have 8363134 pixels. We made a subset in order to get the training data sets: for each band (B1, B2 and B3), 916 pixels were extracted (through QGIS and the GRASS GIS command line), so that we finally had three images with 916 pixels each and NA everywhere else. We called them B1_train, B2_train and B3_train. In R they were imported as SpatialGridDataFrame (SGDF) objects, each with 8363134 obs. of 1 variable (8362218 NAs and 916 values).

2) Then we created a data frame containing the above-mentioned SGDFs:

    training <- data.frame(TL_training_2006_id, B1_train, B2_train, B3_train)

where the first variable contains the classes for the classification.

    names(training)
     [1] "TL_training_2006_id" "x"                   "y"
     [4] "B1_train"            "x.1"                 "y.1"
     [7] "B2_train"            "x.2"                 "y.2"
    [10] "B3_train"            "x.3"                 "y.3"

    str(training)
    'data.frame': 916 obs. of 12 variables:
     $ TL_training_2006_id: int 7 7 7 7 7 7 7 7 7 7 ...
     $ x                  : num 680239 680254 680269 680254 680269 ...
     $ y                  : num 4545534 4545519 4545519 4545504 4545504 ...
     $ B1_train           : int 110 110 110 110 110 108 109 110 109 111 ...
     $ x.1                : num 680239 680254 680269 680254 680269 ...
     $ y.1                : num 4545534 4545519 4545519 4545504 4545504 ...
     $ B2_train           : int 64 65 64 64 64 65 65 65 65 65 ...
     $ x.2                : num 680239 680254 680269 680254 680269 ...
     $ y.2                : num 4545534 4545519 4545519 4545504 4545504 ...
     $ B3_train           : int 42 43 43 43 42 42 42 43 43 42 ...
     $ x.3                : num 680239 680254 680269 680254 680269 ...
     $ y.3                : num 4545534 4545519 4545519 4545504 4545504 ...

3) Then we applied the svm() function to calibrate the model on the training data set:

    model_ASTER3 <- svm(TL_training_2006_id ~ B1_train + B2_train + B3_train, data = training)

4) We created a data frame containing the three complete images, one per band (the ones we want to classify):

    composition <- data.frame(B1, B2, B3)
    names(composition)
    [1] "B1"  "x"   "y"   "B2"  "x.1" "y.1" "B3"  "x.2" "y.2"

I removed the columns containing the coordinates and then I tried this:

    pred <- predict(model_ASTER3, composition)

but I got this error message:

    Error in model.frame.default(object, data, xlev = xlev) :
      object is not a matrix

Thank you for your help! :)

PS:

    dput(training[1:10, ])
    structure(list(TL_training_2006_id = c(7L, 7L, 7L, 7L, 7L, 7L,
    7L, 7L, 7L, 7L), x = c(680239.441714673, 680254.438325991,
    680269.434937309, 680254.438325991, 680269.434937309, 680284.431548628,
    680299.428159946, 680254.438325991, 680269.434937309, 680284.431548628),
    y = c(4545534.08962597, 4545519.09315455, 4545519.09315455,
    4545504.09668313, 4545504.09668313, 4545504.09668313, 4545504.09668313,
    4545489.10021171, 4545489.10021171, 4545489.10021171),
    AST_L1B_2008_05_2009_area_giusta_1_training = c(110L, 110L, 110L,
    110L, 110L, 108L, 109L, 110L, 109L, 111L), x.1 = c(680239.441714673,
    680254.438325991, 680269.434937309, 680254.438325991, 680269.434937309,
    680284.431548628, 680299.428159946, 680254.438325991, 680269.434937309,
    680284.431548628), y.1 = c(4545534.08962597, 4545519.09315455,
    4545519.09315455, 4545504.09668313, 4545504.09668313, 4545504.09668313,
    4545504.09668313, 4545489.10021171, 4545489.10021171, 4545489.10021171),
    AST_L1B_2008_05_2009_area_giusta_2_training = c(64L, 65L, 64L,
    64L, 64L, 65L, 65L, 65L, 65L, 65L), x.2 = c(680239.441714673,
    680254.438325991, 680269.434937309, 680254.438325991, 680269.434937309,
    680284.431548628, 680299.428159946, 680254.438325991, 680269.434937309,
    680284.431548628), y.2 = c(4545534.08962597, 4545519.09315455,
    4545519.09315455, 4545504.09668313, 4545504.09668313, 4545504.09668313,
    4545504.09668313, 4545489.10021171, 4545489.10021171, 4545489.10021171),
    AST_L1B_2008_05_2009_area_giusta_3N_training = c(42L, 43L, 43L,
    43L, 42L, 42L, 42L, 43L, 43L, 42L), x.3 = c(680239.441714673,
    680254.438325991, 680269.434937309, 680254.438325991, 680269.434937309,
    680284.431548628, 680299.428159946, 680254.438325991, 680269.434937309,
    680284.431548628), y.3 = c(4545534.08962597, 4545519.09315455,
    4545519.09315455, 4545504.09668313, 4545504.09668313, 4545504.09668313,
    4545504.09668313, 4545489.10021171, 4545489.10021171, 4545489.10021171)),
    .Names = c("TL_training_2006_id", "x", "y",
    "AST_L1B_2008_05_2009_area_giusta_1_training", "x.1", "y.1",
    "AST_L1B_2008_05_2009_area_giusta_2_training", "x.2", "y.2",
    "AST_L1B_2008_05_2009_area_giusta_3N_training", "x.3", "y.3"),
    row.names = c(1328853L, 1331805L, 1331806L, 1334756L, 1334757L,
    1334758L, 1334759L, 1337707L, 1337708L, 1337709L), class = "data.frame")
Giuseppe Calamita
PhD at CNR-IMAA, Italian National Council of Research - Institute of Methodologies for Environmental Analysis, Tito Scalo - Potenza, Italy
Look at ?predict.svm: you'll see that you need to provide a matrix, not a data.frame.

Etienne
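A minimal sketch of the matrix route suggested above, on made-up data rather than the ASTER bands. The column names class_id, B1, B2, B3 and all values are hypothetical. It assumes the e1071 package is installed, and it fits the model through the svm(x, y) interface so that both fitting and prediction receive a numeric matrix with the same columns in the same order:

```r
library(e1071)  # provides svm() and the predict() method for svm models

set.seed(1)
bands <- c("B1", "B2", "B3")

# Hypothetical training table: one row per training pixel, class as a factor.
training <- data.frame(
  class_id = factor(rep(c(2, 7), each = 20)),
  B1 = c(rnorm(20, mean = 60, sd = 3), rnorm(20, mean = 110, sd = 3)),
  B2 = c(rnorm(20, mean = 40, sd = 3), rnorm(20, mean = 65, sd = 3)),
  B3 = c(rnorm(20, mean = 30, sd = 3), rnorm(20, mean = 42, sd = 3))
)

# Fit on a numeric matrix of band values only (no coordinate columns).
model <- svm(x = as.matrix(training[, bands]), y = training$class_id,
             kernel = "linear")

# Hypothetical pixels to classify; the coordinate columns are dropped by
# selecting only the band columns before converting to a matrix.
composition <- data.frame(B1 = c(61, 109, 58), x = c(680239, 680254, 680269),
                          y = c(4545534, 4545519, 4545519),
                          B2 = c(41, 66, 39), B3 = c(29, 43, 31))
newdata <- as.matrix(composition[, bands])

pred <- predict(model, newdata)
```

Selecting the columns by name on both sides keeps the predictors aligned even if the data frames list them in a different order or carry extra columns such as coordinates.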
Dear Etienne, thanks a lot for your help.

We finally managed to perform the svm classification in this way:

    library(spgrass6)
    G <- gmeta6()

    # classes, training area
    TL_training_2006_id.raw <- readRAST6("TL_training_2006_id")
    # bands, training area
    B1_B2_B3_train.raw <- readRAST6(c("AST_L1B_2008_05_2009_area_giusta_1_training",
                                      "AST_L1B_2008_05_2009_area_giusta_2_training",
                                      "AST_L1B_2008_05_2009_area_giusta_3N_training"))
    # bands, complete data
    B1_B2_B3_compl.raw <- readRAST6(c("AST_L1B_2008_05_2009_area_giusta_1",
                                      "AST_L1B_2008_05_2009_area_giusta_2",
                                      "AST_L1B_2008_05_2009_area_giusta_3N"))

    # transform classes from numeric to factor
    is.numeric(TL_training_2006_id.raw@data$TL_training_2006_id)  # TRUE
    class(TL_training_2006_id.raw@data$TL_training_2006_id)       # numeric
    TL_training_2006_id.raw@data$TL_training_2006_id <-
      as.factor(TL_training_2006_id.raw@data$TL_training_2006_id)

    # create NA masks using complete.cases()
    TL_training_2006_id.na_mask <- complete.cases(TL_training_2006_id.raw@data)
    B1_B2_B3_train.na_mask <- complete.cases(B1_B2_B3_train.raw@data)
    B1_B2_B3_compl.na_mask <- complete.cases(B1_B2_B3_compl.raw@data)

    # get values based on the NA masks
    TL_training_2006_id <- TL_training_2006_id.raw@data[TL_training_2006_id.na_mask, ]
    B1_B2_B3_train <- B1_B2_B3_train.raw@data[B1_B2_B3_train.na_mask, ]
    B1_B2_B3_compl <- B1_B2_B3_compl.raw@data[B1_B2_B3_compl.na_mask, ]

    # create SVM model
    library(e1071)
    x <- B1_B2_B3_train
    y <- TL_training_2006_id
    model_ASTER <- svm(x, y)

    # predict
    pred <- predict(model_ASTER, B1_B2_B3_compl.raw@data)
    # same as:
    pred <- predict(model_ASTER, B1_B2_B3_compl.raw@data[B1_B2_B3_compl.na_mask, ], locations=coordinates(utm_wgs84))

Now the issue is that the "pred" object is:

    str(pred)
     Factor w/ 4 levels "2","3","4","5": 3 3 3 3 3 3 3 3 3 3 ...
     - attr(*, "names")= chr [1:920591] "24389" "24390" "24391" "25729" ...

That is, it contains the predicted (classified) values, but it is not an S4 SGDF object. Do you have any advice on how to transform it back into an SGDF with the coordinates(B1_B2_B3_compl.raw)?

Thank you!
Giuseppe
I usually use a RasterLayer object (from the raster package) instead of a SpatialGridDataFrame, but you probably just have to bind it to your data:

    TL_training_2006_id.raw@data$prediction <- pred

This will create a band containing your predictions. The raster package doesn't handle factors, so you have to use as.integer(), but it is probably the same.
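A minimal sketch of writing the predictions back, with a plain data frame standing in for the @data slot of the grid object (band_data, ok and prediction are hypothetical names, and the pred values are made up). It rests on two assumptions: predict() returned values only for the rows without NA, so pred is shorter than the grid, and the class labels are integers stored as factor levels. Note that as.integer() applied directly to a factor returns the internal level codes, so the sketch converts through as.character() to recover the original labels:

```r
# Hypothetical stand-in for the @data slot of the complete image.
band_data <- data.frame(B1 = c(60, NA, 110, 108, NA, 62),
                        B2 = c(40, NA, 65, 64, NA, 41),
                        B3 = c(30, NA, 42, 43, NA, 31))

# Rows that have a value in every band; only these can be classified.
ok <- complete.cases(band_data)

# Stand-in for the predict() result: one class per complete row, named by row.
pred <- factor(c("2", "5", "5", "2"), levels = c("2", "3", "4", "5"))
names(pred) <- rownames(band_data)[ok]

# Full-length vector, NA where there was nothing to classify.
prediction <- rep(NA_integer_, nrow(band_data))
prediction[ok] <- as.integer(as.character(pred))  # labels 2 and 5, not codes 1 and 4

# The new column now has one value per grid cell, so it can be bound as a band.
band_data$prediction <- prediction
```

Because the new column has the same length and cell order as the other bands, the same assignment should carry over to the @data slot of the grid object without losing the coordinates.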
Hi Etienne,

Gab and I would like to thank you! We finally got what we were looking for, although we did it without using the "raster" package. We are going to try it as well, to see whether it allows faster computation or data manipulation. If you're interested, we can show you the code we wrote to make things work. Goodbye and thank you again.
Giuseppe Calamita