Recursive Feature Elimination with SVM

Recursive Feature Elimination with SVM

pp2019
This post was updated on .
CONTENTS DELETED
The author has deleted this message.
Reply | Threaded
Open this post in threaded view
|

Re: Recursive Feature Elimination with SVM

David Winsemius

On 1/1/19 4:40 AM, Priyanka Purkayastha wrote:
> I have a dataset (data) with 700 rows and 7000 columns. I am trying to do
> recursive feature selection with an SVM model. A quick Google search
> turned up some code for recursive feature elimination with SVM. However, I
> am unable to understand the first part of the code. How do I pass my
> dataset to it?


Generally the "labels" are given to such a machine learning function as the
y argument, while the "features" are passed as a matrix to the x argument.
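
A minimal sketch of that split, assuming (as in the code quoted below) that the labels sit in a column named Class. The simulated data frame `dat` is a stand-in, since the real file is not available. Note that the label column is dropped from x; the quoted attempt uses `x <- data`, which leaves Class among the features.

```r
# Simulated stand-in for the poster's file (assumption: labels are in "Class").
set.seed(1)
dat <- data.frame(matrix(rnorm(20 * 5), nrow = 20))
dat$Class <- rep(c("A", "B"), each = 10)

# Features: every column except the label, as a numeric matrix.
x <- as.matrix(dat[, setdiff(names(dat), "Class")])

# Labels: a factor, which is what a classification SVM expects for y.
y <- factor(dat$Class)

dim(x)
table(y)
```

With x and y built this way, the ranking function still has to be called, for example `ranking <- svmrfeFeatureRanking(x, y)`. Defining the function on its own produces no output.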


--

David.

>
> My dataset is a matrix named data. Please give me an example of
> recursive feature selection with SVM. Below is the code I found for
> recursive feature search.
>
>      svmrfeFeatureRanking = function(x,y){
>
>      #Checking for the variables
>      stopifnot(!is.null(x) == TRUE, !is.null(y) == TRUE)
>
>      n = ncol(x)
>      survivingFeaturesIndexes = seq_len(n)
>      featureRankedList = vector(length=n)
>      rankedFeatureIndex = n
>
>      while(length(survivingFeaturesIndexes)>0){
>      #train the support vector machine
>      svmModel = svm(x[, survivingFeaturesIndexes], y, cost = 10, cachesize=500,
>                  scale=FALSE, type="C-classification", kernel="linear" )
>
>      #compute the weight vector
>      w = t(svmModel$coefs)%*%svmModel$SV
>
>      #compute ranking criteria
>      rankingCriteria = w * w
>
>      #rank the features
>      ranking = sort(rankingCriteria, index.return = TRUE)$ix
>
>      #update feature ranked list
>      featureRankedList[rankedFeatureIndex] = survivingFeaturesIndexes[ranking[1]]
>      rankedFeatureIndex = rankedFeatureIndex - 1
>
>      #eliminate the feature with smallest ranking criterion
>      (survivingFeaturesIndexes = survivingFeaturesIndexes[-ranking[1]])}
>      return (featureRankedList)}
>
>
>
> I tried to adapt the above code to my own data, as shown below:
>
>      library(e1071)
>      library(caret)
>
>      data<- read.csv("matrix.csv", header = TRUE)
>
>      x <- data
>      y <- as.factor(data$Class)
>
>      svmrfeFeatureRanking = function(x,y){
>
>        #Checking for the variables
>        stopifnot(!is.null(x) == TRUE, !is.null(y) == TRUE)
>
>        n = ncol(x)
>        survivingFeaturesIndexes = seq_len(n)
>        featureRankedList = vector(length=n)
>        rankedFeatureIndex = n
>
>        while(length(survivingFeaturesIndexes)>0){
>          #train the support vector machine
>          svmModel = svm(x[, survivingFeaturesIndexes], y, cross=10, cost = 10,
>                         type="C-classification", kernel="linear" )
>
>          #compute the weight vector
>          w = t(svmModel$coefs)%*%svmModel$SV
>
>          #compute ranking criteria
>          rankingCriteria = w * w
>
>          #rank the features
>          ranking = sort(rankingCriteria, index.return = TRUE)$ix
>
>          #update feature ranked list
>          featureRankedList[rankedFeatureIndex] = survivingFeaturesIndexes[ranking[1]]
>          rankedFeatureIndex = rankedFeatureIndex - 1
>
>          #eliminate the feature with smallest ranking criterion
>          (survivingFeaturesIndexes = survivingFeaturesIndexes[-ranking[1]])}
>
>        return (featureRankedList)}
>
> But I could not get past the "update feature ranked list" step.
> Please advise.
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> [hidden email] mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


Re: Recursive Feature Elimination with SVM

David Winsemius

On 1/1/19 5:31 PM, Priyanka Purkayastha wrote:
> Thank you, David. I tried the same: I passed x as the data matrix and y
> as the class labels. But it returned an empty "featureRankedList". I
> get no output when I run the code.


If you want people to spend time on this you should post a reproducible
example. See the Posting Guide ... and learn to post in plain text.
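
One way to make such a post reproducible, sketched here under the assumption that a small simulated data set shows the same problem as the real one: build a few rows in base R and paste the output of dput() into the message. The object name `small` and the sizes are only illustrative.

```r
# A tiny simulated data set that anyone can recreate (illustrative sizes).
set.seed(42)
n <- 12
p <- 4
small <- data.frame(matrix(round(rnorm(n * p), 2), nrow = n))
small$Class <- factor(rep(c("case", "control"), each = n / 2))

# dput() prints R code that rebuilds the object; paste that text into the
# post so readers do not need access to a private CSV file.
dput(small)
```

Readers can then assign the pasted dput() output to a variable and run the posted code unchanged.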


--

David

>
> --
> Regards,
>
> Priyanka Purkayastha, M.Tech, Ph.D.,
> SERB National Postdoctoral Researcher
> Genomics and Systems Biology Lab,
> Department of Chemical Engineering,
> Indian Institute of Technology Bombay (IITB),
> Powai, Mumbai- 400076
>
>
>


Re: Recursive Feature Elimination with SVM

Bert Gunter-2
Note: **NOT** reproducible (only you have "data.csv").

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Tue, Jan 1, 2019 at 11:14 PM Priyanka Purkayastha <
[hidden email]> wrote:

> This is the code I tried,
>
> library(e1071)
> library(caret)
> library(ROCR)
>
> data <- read.csv("data.csv", header = TRUE)
> set.seed(998)
>
> inTraining <- createDataPartition(data$Class, p = .70, list = FALSE)
> training <- data[ inTraining,]
> testing  <- data[-inTraining,]
>
> while(length(data)>0){
>
> ## Building the model ####
> svm.model <- svm(Class ~ ., data = training,
>                  cross=10, metric="ROC", type="eps-regression", kernel="linear",
>                  na.action=na.omit, probability = TRUE)
> print(svm.model)
>
>
> ###### auc  measure #######
>
> #prediction and ROC
> svm.model$index
> svm.pred <- predict(svm.model, testing, probability = TRUE)
>
> #calculating auc
> c <- as.numeric(svm.pred)
> c = c - 1
> pred <- prediction(c, testing$Class)
> perf <- performance(pred,"tpr","fpr")
> plot(perf,fpr.stop=0.1)
> auc <- performance(pred, measure = "auc")
> auc <- auc@y.values[[1]]
> print(length(data))
> print(auc)
>
> #compute the weight vector
> w = t(svm.model$coefs)%*%svm.model$SV
>
> #compute ranking criteria
> weight_matrix = w * w
>
> #rank the features
> w_transpose <- t(weight_matrix)
> w2 <- as.matrix(w_transpose[order(w_transpose[,1], decreasing = FALSE),])
> a <- as.matrix(w2[which(w2 == max(w2)),]) #to get the rows with minimum values
> row.names(a) -> remove
> training<- data[,setdiff(colnames(data),remove)]
> }
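
Some observations on the loop above, followed by a sketch. The condition `length(data) > 0` never changes because `data` itself is never modified, so the loop cannot end. `which(w2 == max(w2))` picks the largest squared weight, although the comment asks for the smallest. And `training` is rebuilt from the full `data` on every pass, so earlier removals are lost. The version below is only a sketch of SVM-RFE with a linear kernel, assuming the e1071 package, a two-class factor y, and simulated data in place of "data.csv". The names `svm_rfe`, `surviving` and `ranked` are illustrative.

```r
library(e1071)

# Simulated data (assumption): 40 samples, 6 features, signal mostly in f1.
set.seed(998)
x <- matrix(rnorm(40 * 6), nrow = 40,
            dimnames = list(NULL, paste0("f", 1:6)))
y <- factor(ifelse(x[, 1] + rnorm(40, sd = 0.3) > 0, "pos", "neg"))

svm_rfe <- function(x, y, cost = 10) {
  surviving <- seq_len(ncol(x))   # column indexes still in the model
  ranked <- integer(ncol(x))      # filled from the last position backwards
  pos <- ncol(x)
  while (length(surviving) > 0) {
    # drop = FALSE keeps x a matrix even when one column is left
    model <- svm(x[, surviving, drop = FALSE], y, cost = cost,
                 scale = FALSE, type = "C-classification", kernel = "linear")
    # weight vector of the linear SVM; its squares are the ranking criterion
    w <- t(model$coefs) %*% model$SV
    worst <- which.min(w * w)
    ranked[pos] <- surviving[worst]
    pos <- pos - 1
    # remove the weakest feature so the loop shrinks and terminates
    surviving <- surviving[-worst]
  }
  ranked                          # most important feature first
}

ranking <- svm_rfe(x, y)
colnames(x)[ranking]
```

Removing one feature per pass means one SVM fit per column, which may be slow with 7000 columns; dropping several low-ranked features per iteration is a common shortcut.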