Re: Efficient way to code using optim()

parkbomee

Hi all,

I am trying to estimate a simple logit model.
I am maximizing the log-likelihood by MLE, using optim().
The problem is that each observation has a different set of choice options, so I need a loop inside the objective function,
which I think slows down the optimization.

The data are constructed so that each row represents the characteristics of one alternative,
and CS is a variable that identifies the choice situation (say, 1 to the number of observations).
cum_count is the "cumulative" count over choice situations, i.e. the running total of the number of available alternatives in each CS.
So I am maximizing the sum of log[exp(U(chosen)) / sum(exp(U(all alternatives)))].

With 6 or 7 predictors the running time is about 10 minutes, and it slows down exponentially as I add more predictors (more thetas to estimate).
I would like to know whether there is a way to improve the running time.
My code is below.
 
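simple_logit = function(theta){
    # one realized (chosen) probability per choice situation
    realized_prob = rep(0, max(data$CS))
    # utility of every alternative: predictors (columns 4 to 35) times theta
    theta_multiple = as.matrix(data[,4:35]) %*% as.matrix(theta)
    # first choice situation: rows 1 to cum_count[1]; its first row is used as the chosen alternative
    realized_prob[1] = exp(theta_multiple[1]) / sum(exp(theta_multiple[1:cum_count[1]]))
    for (i in 2:length(realized_prob)){
        # rows of choice situation i; its first row is used as the chosen alternative
        rows = (cum_count[i-1]+1):cum_count[i]
        realized_prob[i] = exp(theta_multiple[rows[1]]) / sum(exp(theta_multiple[rows]))
    }
    # negative log-likelihood, because optim() minimizes
    -sum(log(realized_prob))
}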
 
initial = rep(0,32)
out33 = optim(initial, simple_logit, method="BFGS", hessian=TRUE)
 
 
 
Many thanks in advance!!!    
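
One possible way to drop the loop, sketched here under assumptions rather than as a tested fix for the poster's data: rowsum() can add up exp(U) within each choice situation in a single call. The sketch assumes that rows are sorted by CS, that the first row of each choice situation is the chosen alternative (as the posted code implies), and that the predictor matrix is built once outside the objective. The simulated data and the names X, cs, first_row, loop_nll and vec_nll are hypothetical stand-ins.

```r
# Simulated stand-in for the poster's data (all names here are hypothetical).
set.seed(1)
n_cs  <- 200                                    # number of choice situations
n_alt <- sample(2:5, n_cs, replace = TRUE)      # alternatives per situation
cs    <- rep(seq_len(n_cs), times = n_alt)      # plays the role of data$CS
p     <- 4                                      # number of predictors
X     <- matrix(rnorm(length(cs) * p), ncol = p)  # built once, outside the objective

cum_count <- cumsum(n_alt)                      # cumulative count, as in the post
first_row <- c(1, head(cum_count, -1) + 1)      # assumed row of the chosen alternative

# Loop-based objective in the style of the post, adapted to X and cum_count.
loop_nll <- function(theta) {
  u <- X %*% theta
  prob <- numeric(n_cs)
  prob[1] <- exp(u[1]) / sum(exp(u[1:cum_count[1]]))
  for (i in 2:n_cs) {
    rows <- (cum_count[i - 1] + 1):cum_count[i]
    prob[i] <- exp(u[rows[1]]) / sum(exp(u[rows]))
  }
  -sum(log(prob))
}

# Vectorized objective: rowsum() sums exp(u) within each choice situation,
# and log(prob) is written as u_chosen - log(sum of exp(u)).
vec_nll <- function(theta) {
  u <- drop(X %*% theta)
  denom <- rowsum(exp(u), cs)                   # one row per choice situation
  -sum(u[first_row] - log(denom))
}

theta0 <- rnorm(p)
all.equal(loop_nll(theta0), vec_nll(theta0))    # compare the two versions
```

Both functions compute the same negative log-likelihood; the vectorized one avoids the per-situation indexing and the repeated as.matrix() conversion, which is where a gain would be expected.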

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Giovanni Petris

Unless this is a homework problem, you would be much better off using
glm().

Giovanni

> Date: Fri, 30 Oct 2009 12:23:45 -0700
> From: parkbomee <[hidden email]>

parkbomee

Thank you.

But I would prefer to use a hand-written function, which allows a more flexible model specification.
Later on, I may want to add random parameters.
So I would like to know whether there is a more efficient way to code this and speed it up.
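
A hand-written likelihood can also supply its own gradient. When optim() is called without the gr argument, it uses a finite-difference approximation, which costs extra objective evaluations for every parameter, so a slow objective gets slower as predictors are added. The following is a minimal sketch, not a tested solution for the data in this thread: it assumes CS is coded 1 to N with rows sorted by CS, and that the first row of each choice situation is the chosen alternative. The simulated data and the names X, cs, first_row, choice_prob, nll and nll_grad are hypothetical.

```r
# Random placeholder data (hypothetical; the estimates carry no meaning).
set.seed(2)
n_cs  <- 300                                    # number of choice situations
n_alt <- sample(2:5, n_cs, replace = TRUE)      # alternatives per situation
cs    <- rep(seq_len(n_cs), times = n_alt)      # choice-situation id per row
p     <- 3                                      # number of predictors
X     <- matrix(rnorm(length(cs) * p), ncol = p)
first_row <- c(1, head(cumsum(n_alt), -1) + 1)  # chosen alternative assumed first

# Logit probability of every row within its own choice situation.
choice_prob <- function(theta) {
  eu <- exp(drop(X %*% theta))
  eu / rowsum(eu, cs)[cs, 1]                    # divide by the per-situation sum
}

# Negative log-likelihood: minus the sum of log P(chosen).
nll <- function(theta) -sum(log(choice_prob(theta)[first_row]))

# Analytic gradient: minus (sum of chosen x) plus (probability-weighted sum of x).
nll_grad <- function(theta) {
  -(colSums(X[first_row, , drop = FALSE]) -
      drop(crossprod(X, choice_prob(theta))))
}

fit <- optim(rep(0, p), nll, gr = nll_grad, method = "BFGS", hessian = TRUE)
fit$par
```

Because the objective stays an ordinary R function, the specification remains as flexible as before; only the gradient has to be kept in step with any change to the utility.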


> Date: Fri, 30 Oct 2009 16:10:29 -0600
> To: [hidden email]
> CC: [hidden email]
> Subject: Re: [R] Efficient way to code using optim()
> From: [hidden email]