Glmnet survival cox predict

amirhad
Hi all,
I'm trying to get predicted probabilities from a survival elastic net.
When I try to predict on the test set using the model fitted on the training
set, the result has as many rows as the training data (6400 rows) instead of
the test data (2400 rows). I really don't understand why, and it prevents me
from checking performance with the c-index.
The code:

library(survival)  # provides Surv() and coxph()

data <- read.csv("old4.csv", header = TRUE)
library(imputeMissings)
data <- impute(data, object = NULL, method = "median/mode")

trainstatus <- train$DIED1095
trainTime   <- train$TIME
y <- Surv(trainTime, trainstatus)

trainX <- train[-c(12, 63, 64, 65, 66, 67, 68, 69, 70, 71)]
x <- data.matrix(trainX)


library(glmnet)
fit <- glmnet(x, Surv(trainTime, trainstatus), family = "cox",
              alpha = 0.1, maxit = 10000)
max.dev.index  <- which.max(fit$dev.ratio)
optimal.lambda <- fit$lambda[max.dev.index]
optimal.beta   <- fit$beta[, max.dev.index]
nonzero.coef   <- abs(optimal.beta) > 0
selectedBeta   <- optimal.beta[nonzero.coef]
selectedTrainX <- x[, nonzero.coef]

coxph.model <- coxph(Surv(train$TIME, train$DIED365) ~ x, data = train,
                     init = selectedBeta, iter = 0)
coxph.predict <- predict(coxph.model, test)

nrow(test)
# 2872

nrow(train)
# 6701

length(coxph.predict)
# 6701

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: Glmnet survival cox predict

David Winsemius

On 11/15/19 10:49 AM, Amir Hadanny wrote:
> Hi all,
> I'm trying to get predicted probabilities from a survival elastic net.
> When I try to predict on the test set using the model fitted on the training
> set, the result has as many rows as the training data (6400 rows) instead of
> the test data (2400 rows). I really don't understand why, and it prevents me
> from checking performance with the c-index.


If you call most `predict` functions with a second argument that does not
contain the predictors used in the model, they return predictions for the
original data. The only place the `test` object appears in your code is
the call to `predict.coxph`, so my guess is that it does not meet that
function's requirements for a valid newdata argument. (Another thought
was that `test` might not exist, but then both the predict call and the
nrow call would have thrown an error.)
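
The behaviour described above can be illustrated with a minimal sketch on
simulated data. Everything below is an assumption made for illustration
(the data frame `dat`, the columns `age` and `bmi`, the seed); none of it
comes from the poster's file. The first model uses a free-standing matrix
`x` in the formula, as in the question; the second names the columns, so
`newdata` can supply them.

```r
library(survival)

set.seed(1)
n <- 200
# Simulated stand-in for the poster's data; all columns are hypothetical
dat <- data.frame(
  TIME = rexp(n, rate = 0.1),
  DIED = rbinom(n, 1, 0.7),
  age  = rnorm(n, mean = 60, sd = 10),
  bmi  = rnorm(n, mean = 25, sd = 4)
)
idx   <- sort(sample(nrow(dat), floor(0.7 * nrow(dat))))
train <- dat[idx, ]
test  <- dat[-idx, ]

# Pattern from the question: predictors enter as a free-standing matrix `x`
x <- data.matrix(train[, c("age", "bmi")])
fit_matrix <- coxph(Surv(TIME, DIED) ~ x, data = train)

# `test` has no column called `x`, so the training matrix `x` is picked up
# from the formula's environment and used again
pred_matrix <- predict(fit_matrix, newdata = test)

# Alternative: name the columns in the formula so newdata can supply them
fit_named  <- coxph(Surv(TIME, DIED) ~ age + bmi, data = train)
pred_named <- predict(fit_named, newdata = test)

# Compare row counts with prediction lengths
c(train = nrow(train), test = nrow(test),
  matrix_term = length(pred_matrix), named_terms = length(pred_named))
```

If the sketch behaves as described, `matrix_term` matches the training row
count while `named_terms` matches the test row count.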


But since you don't provide code that creates `test` or even an
unambiguous way of examining its structure, that is entirely a guess.
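
One way to examine this, sketched here on simulated data (the objects
below are hypothetical stand-ins, not the poster's), is to list the
variables the fitted model expects and compare them with the columns of
the candidate newdata:

```r
library(survival)

set.seed(42)
# Hypothetical stand-ins for the poster's train and test data frames
train <- data.frame(TIME = rexp(50), DIED = rbinom(50, 1, 0.6),
                    age = rnorm(50, mean = 60, sd = 10))
test  <- data.frame(TIME = rexp(20), DIED = rbinom(20, 1, 0.6),
                    age = rnorm(20, mean = 60, sd = 10))

# Same pattern as in the question: a matrix term in the formula
x   <- data.matrix(train["age"])
fit <- coxph(Surv(TIME, DIED) ~ x, data = train)

str(test)  # column names and types of the candidate newdata

# Predictor variables named in the model formula (response removed)
needed <- all.vars(delete.response(terms(fit)))

# Variables the model needs that newdata cannot supply
missing_vars <- setdiff(needed, names(test))
missing_vars
```

Here `missing_vars` contains `x`: the model asks for a variable called `x`,
and `test` only has the original column names.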


And finally ... R-help is a plain text mailing list, so please read
the message at the bottom of every transmission from the mail server,
i.e. read the Posting Guide. (It is not at all difficult to get
gmail.com to send plain text.)


--

David.

Re: Glmnet survival cox predict

amirhad
Thank you.
Both train and test originate from the same data object.

The missing code is attached:

data <- read.csv("old4.csv", header = TRUE)

library(imputeMissings)
data <- impute(data, object = NULL, method = "median/mode")

# `col` is not defined in this post; presumably a vector of column names
for (i in col[13:68]) {
  data[i] <- lapply(data[i], factor)
}
for (i in col[1:12]) {
  data[i] <- lapply(data[i], as.numeric)  # convert to numeric
}

data$TIME <- as.numeric(data$TIME)

data <- data[-c(61, 62, 64, 65, 66, 67, 68)]
data$TIME <- ceiling(data$TIME / 12)
data$TIME[which(data$TIME == 37)] <- 36

# 70/30 train/test split by row index
data1 <- sort(sample(nrow(data), nrow(data) * .7))
train <- data[data1, ]
test  <- data[-data1, ]


So test should have exactly the same structure as train, and I still can't
find the issue.

Thank you,
Amir
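
A sketch of one possible way around the problem, under stated assumptions:
the data are simulated, the columns `age`, `bmi` and `sbp` are made up, and
the glmnet step is replaced by a hard-coded, hypothetical coefficient
vector `selectedBeta`. The idea is to build the formula from the names of
the selected columns rather than from a matrix `x`, so that
`predict(..., newdata = test)` can match columns by name. Note that `init`
needs one value per term in the formula.

```r
library(survival)

set.seed(123)
n <- 300
# Simulated stand-in for the poster's data; all columns are hypothetical
dat <- data.frame(
  TIME = rexp(n, rate = 0.1),
  DIED = rbinom(n, 1, 0.7),
  age  = rnorm(n, mean = 60, sd = 10),
  bmi  = rnorm(n, mean = 25, sd = 4),
  sbp  = rnorm(n, mean = 130, sd = 15)
)
idx   <- sort(sample(nrow(dat), floor(0.7 * nrow(dat))))
train <- dat[idx, ]
test  <- dat[-idx, ]

# Hypothetical nonzero coefficients, standing in for those selected by glmnet
selectedBeta <- c(age = 0.02, bmi = -0.01)
keep <- names(selectedBeta)

# Build the model formula from the selected column names
form <- as.formula(paste("Surv(TIME, DIED) ~", paste(keep, collapse = " + ")))

# iter.max = 0 keeps the coefficients fixed at `init`
coxph.model <- coxph(form, data = train, init = selectedBeta,
                     control = coxph.control(iter.max = 0))

# The formula names columns, so newdata is matched by name
coxph.predict <- predict(coxph.model, newdata = test, type = "lp")
length(coxph.predict) == nrow(test)

# C-index on the test set; reverse = TRUE because a higher linear
# predictor means higher risk and therefore shorter survival
cidx <- concordance(Surv(TIME, DIED) ~ coxph.predict, data = test,
                    reverse = TRUE)
cidx$concordance
```

With real data, `selectedBeta` would come from the glmnet fit, and the same
outcome column would be used in both the glmnet and the coxph calls.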

On Sat, Nov 16, 2019 at 12:00 AM David Winsemius <[hidden email]>
wrote:

> If you call most `predict` functions with a second argument that does not
> contain the predictors used in the model, they return predictions for the
> original data. The only place the `test` object appears in your code is
> the call to `predict.coxph`, so my guess is that it does not meet that
> function's requirements for a valid newdata argument. (Another thought
> was that `test` might not exist, but then both the predict call and the
> nrow call would have thrown an error.)