function censReg in panel data setting

classic Classic list List threaded Threaded
17 messages Options
Reply | Threaded
Open this post in threaded view
|

function censReg in panel data setting

Igors
Hello all,

I have a problem estimating Random Effects model using censReg function.

small part of code:

UpC <- censReg(Power ~ Windspeed, left = -Inf, right = 2000,data=PData_In,method="BHHH",nGHQ = 4)

Error in maxNRCompute(fn = logLikAttr, fnOrig = fn, gradOrig = grad, hessOrig = hess,  :
  NA in the initial gradient


...then I tried to set starting values myself and here is the error what I got:

UpC <- censReg(Power ~ Windspeed, left = -Inf, right = 2000,data=PData_In,method="BHHH",nGHQ = 4,start=c(-691,189,5))

Error in names(start) <- c(colnames(xMat), "logSigmaMu", "logSigmaNu") :
  'names' attribute [4] must be the same length as the vector [3]


How can I solve any of these errors?


Thanks in advance!


Best,
Igors

Reply | Threaded
Open this post in threaded view
|

Re: function censReg in panel data setting

Arne Henningsen-4
Dear Igors

On 5 September 2011 23:58, Igors <[hidden email]> wrote:

> I have a problem estimating Random Effects model using censReg function.
>
> small part of code:
>
> UpC <- censReg(Power ~ Windspeed, left = -Inf, right =
> 2000,data=PData_In,method="BHHH",nGHQ = 4)
>
> Error in maxNRCompute(fn = logLikAttr, fnOrig = fn, gradOrig = grad,
> hessOrig = hess,  :
>  NA in the initial gradient
>
>
> ...then I tried to set starting values myself and here is the error what I
> got:
>
> UpC <- censReg(Power ~ Windspeed, left = -Inf, right =
> 2000,data=PData_In,method="BHHH",nGHQ = 4,start=c(-691,189,5))
>
> Error in names(start) <- c(colnames(xMat), "logSigmaMu", "logSigmaNu") :
>  'names' attribute [4] must be the same length as the vector [3]
>
>
> How can I solve any of these errors?

You chose a suitable solution for the first problem (NA in initial
gradient). Unfortunately, the documentation of "censReg" is not very
clear regarding starting values of panel tobit models. Please note
that you have to specify 4 starting values: intercept, slope
parameter, variance of the individual effects ("mu"), and variance of
the general error term ("nu").

http://cran.r-project.org/web/packages/censReg/vignettes/censReg.pdf

Best wishes from Copenhagen,
Arne

--
Arne Henningsen
http://www.arne-henningsen.name

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: function censReg in panel data setting

Igors
>You chose a suitable solution for the first problem (NA in initial
>gradient). Unfortunately, the documentation of "censReg" is not very
>clear regarding starting values of panel tobit models. Please note
>that you have to specify 4 starting values: intercept, slope
>parameter, variance of the individual effects ("mu"), and variance of
>the general error term ("nu").


I have already tried to use 4 parameters, here is what I get:

> UpC <- censReg(Power ~ Windspeed, left = -Inf, right = 2000,data=PData_In,method="BHHH",nGHQ = 4,start=c(-691,189,5,5))

Error in censReg(Power ~ Windspeed, left = -Inf, right = 2000, data = PData_In,  :
  argument 'start' must have length 3


Any ideas how to overcome this one? Could this be an error in package or censReg function?


All the best,
Igors


Best wishes from Copenhagen,
Arne

--
Arne Henningsen
http://www.arne-henningsen.name

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply | Threaded
Open this post in threaded view
|

Re: function censReg in panel data setting

Arne Henningsen-4
On 6 September 2011 07:51, Igors <[hidden email]> wrote:

>>You chose a suitable solution for the first problem (NA in initial
>>gradient). Unfortunately, the documentation of "censReg" is not very
>>clear regarding starting values of panel tobit models. Please note
>>that you have to specify 4 starting values: intercept, slope
>>parameter, variance of the individual effects ("mu"), and variance of
>>the general error term ("nu").
>
>
> I have already tried to use 4 parameters, here is what I get:
>
>> UpC <- censReg(Power ~ Windspeed, left = -Inf, right =
>> 2000,data=PData_In,method="BHHH",nGHQ = 4,start=c(-691,189,5,5))
>
> Error in censReg(Power ~ Windspeed, left = -Inf, right = 2000, data =
> PData_In,  :
>  argument 'start' must have length 3
>
>
> Any ideas how to overcome this one? Could this be an error in package or
> censReg function?

Yes, you are right. This is (was) a bug in the censReg function/package.

I have fixed the package. The source code of the new version (0.5-6)
is available on R-Forge [1]; R packages will be available on CRAN [2]
and R-Forge [3] probably within one or two days.

[1] https://r-forge.r-project.org/scm/?group_id=256
[2] http://cran.r-project.org/package=censReg
[3] https://r-forge.r-project.org/R/?group_id=256

/Arne

--
Arne Henningsen
http://www.arne-henningsen.name

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: function censReg in panel data setting

Igors
Dear Arne,

Thank you for fixing the package.

However I am still struggling to obtain model estmates.

The same code:

> UpC <- censReg(Power ~ Windspeed, left = -Inf, right = 2000,data=PData_In,method="BHHH",nGHQ = 4,start=c(-691.18,186.79,3.9,3.9))

Error in maxNRCompute(fn = logLikAttr, fnOrig = fn, gradOrig = grad, hessOrig = hess,  :
  NA in the initial gradient


I have tried to change starting values and regressors in the model several times, but I always get the same mentioned error message. How can I make it work?  Is this function maxNRCompute() on the last step of calculation (maximization of the ML)?

I had an idea that the error could appear since I have huge sample, but I tried to cut it and it still doesn't work.


Best wishes,
Igors



 
Reply | Threaded
Open this post in threaded view
|

Re: function censReg in panel data setting

Arne Henningsen-4
Dear Igors

On 7 September 2011 10:39, Igors <[hidden email]> wrote:

> However I am still struggling to obtain model estmates.
>
> The same code:
>
>> UpC <- censReg(Power ~ Windspeed, left = -Inf, right =
>> 2000,data=PData_In,method="BHHH",nGHQ = 4,start=c(-691.18,186.79,3.9,3.9))
>
> Error in maxNRCompute(fn = logLikAttr, fnOrig = fn, gradOrig = grad,
> hessOrig = hess,  :
>  NA in the initial gradient
>
>
> I have tried to change starting values and regressors in the model several
> times, but I always get the same mentioned error message. How can I make it
> work?  Is this function maxNRCompute() on the last step of calculation
> (maximization of the ML)?
>
> I had an idea that the error could appear since I have huge sample, but I
> tried to cut it and it still doesn't work.

It is hard to figure out the cause of this error without a
reproducible example. Is is possible that you send a reproducible
example to me?

Could it be that there are NAs in the data or something in the panel
data specification is not as censReg() expects it?

/Arne

--
Arne Henningsen
http://www.arne-henningsen.name

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: function censReg in panel data setting

Igors
Dear Arne,

I have sent you an e-mail to your e-mail at gmail.com. There I have attached my data set and the code for this particular problem. I hope to hear from you soon.

Many thanks in advance!


Igors
Reply | Threaded
Open this post in threaded view
|

Re: function censReg in panel data setting

Igors

Does censReg expect from panel data to be balanced? Because in my case it is unbalanced. Could this be a reason for errors?


Best,
Igors
Reply | Threaded
Open this post in threaded view
|

Re: function censReg in panel data setting

Igors
In reply to this post by Arne Henningsen-4
Dear Arne,

Does censReg support unbalanced panel data?



/Igors
Reply | Threaded
Open this post in threaded view
|

Re: function censReg in panel data setting

Arne Henningsen-4
In reply to this post by Igors
On 8 September 2011 09:56, Igors <[hidden email]> wrote:
> Does censReg expect from panel data to be balanced?

No. censReg() can estimate models with unbalanced panel data. The
estimation in the code that you sent me indeed does not work but if I
remove the user-specified starting values (argument "start"), it works
fine.

R> summary(UpC)

Call:
censReg(formula = Power ~ Windspeed, left = -Inf, right = 2000,
    data = PData_In, nGHQ = 4, method = "BHHH", iterlim = 200)

Observations:
         Total  Left-censored     Uncensored Right-censored
           874              0            847             27

Coefficients:
              Estimate Std. error t value Pr(> t)
(Intercept) -462.26676   19.89517  -23.23  <2e-16 ***
Windspeed    188.38796    1.72492  109.22  <2e-16 ***
logSigmaMu     4.54053    0.03352  135.45  <2e-16 ***
logSigmaNu     5.08733    0.02706  188.00  <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

BHHH maximisation, 131 iterations
Return code 2: successive function values within tolerance limit
Log-likelihood: -5538.124 on 4 Df


/Arne


--
Arne Henningsen
http://www.arne-henningsen.name

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: function censReg in panel data setting

Igors
Dear Arne,


Thank you for your answer. However it doesn't solve my problem fully.

The problem is that I have much bigger data set than I sent to you (it was only a small part : 874 obs.). My full data set is 546718 obs.

If I try to use censReg on full data set, then it still gives me the same already mentioned error about Na's in the initial gradient.


I have sent you an e-mail with full dataset and the code. I would really appreciate if you could check how it works and suggest me how to solve this problem.

I have noticed that you use "iterlim" argumet to specify maximum number of iterations.  How this argument could affect possibility of obtaining estimates?



/Igors
Reply | Threaded
Open this post in threaded view
|

Re: function censReg in panel data setting

Arne Henningsen-4
Dear Igors

On 9 September 2011 22:00, Igors <[hidden email]> wrote:

> Thank you for your answer. However it doesn't solve my problem fully.
>
> The problem is that I have much bigger data set than I sent to you (it was
> only a small part : 874 obs.). My full data set is 546718 obs.
>
> If I try to use censReg on full data set, then it still gives me the same
> already mentioned error about Na's in the initial gradient.
>
>
> I have sent you an e-mail with full dataset and the code.

Thanks!

> I would really
> appreciate if you could check how it works and suggest me how to solve this
> problem.

Now, I get the same error message. The problem is that the
log-likelihood function becomes minus infinity. I guess that this
problem occurs, because a likelihood contribution of (at least) one
individual observation is so small that it is rounded zero and hence,
its logarithm is minus infinity. However, I have to take a deeper look
into this issue before I can give a definite answer and hopefully find
a solution.

> I have noticed that you use "iterlim" argumet to specify maximum number of
> iterations.  How this argument could affect possibility of obtaining
> estimates?

Using your small data set, the maximization of the (log) likelihood
function in censReg() did not converge within 100 iterations.
Therefore, I increased the maximum number of iterations from 100 to
200 -- and then the maximization converged :-(

/Arne

--
Arne Henningsen
http://www.arne-henningsen.name

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: function censReg in panel data setting

Igors
Hi Arne,

Any success in finding possible solutions for my problem?


I have tried to experiment with size of sample and I get really bad picture. I can't get it work even if sample is ~ 1000 obs. And it is way less than I would like to see working, taking into account my full sample size ~ 540 000 obs.


Thanks for your help and let me know when you have something. ;)


/Igors
Reply | Threaded
Open this post in threaded view
|

Re: function censReg in panel data setting

Arne Henningsen-4
Hi Igors

On 13 September 2011 13:27, Igors <[hidden email]> wrote:
> Any success in finding possible solutions for my problem?

Somewhat. The calculation of the log-likelihood values is numerically
much more robust/stable now. The log-likelihood contributions of some
individuals became minus infinity in your model. This was caused by
rounding errors as illustrated in the following simplified example:

log( exp( a ) + exp( b ) )

If a and b become smaller than approximately -800, exp( a ) and exp( b
) are rounded to zero and the log of their sum (zero) is minus
infinity.
I have solved this problem by replacing the above calculation by

log( exp( a - c ) + exp( b - c ) ) + c
with c = max( a, b )

The source code of the improved censReg package is available on
R-Forge [1]; R packages will be available on R-Forge [2] probably
within one day.

[1] https://r-forge.r-project.org/scm/?group_id=256
[2] https://r-forge.r-project.org/R/?group_id=256

Unfortunately, the calculation of the gradients is still not robust
but I expect that I can solve this problem in a similar way as I used
to solve the problem with the likelihood function itself. I will
continue working on this.

> I have tried to experiment with size of sample and I get really bad picture.
> I can't get it work even if sample is ~ 1000 obs. And it is way less than I
> would like to see working, taking into account my full sample size ~ 540 000
> obs.

I hope that you have a very fast computer -- or a lot of time for
waiting many days or even a few weeks.

/Arne


--
Arne Henningsen
http://www.arne-henningsen.name

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: function censReg in panel data setting

Arne Henningsen-4
On 14 September 2011 00:36, Arne Henningsen
<[hidden email]> wrote:

> Hi Igors
>
> On 13 September 2011 13:27, Igors <[hidden email]> wrote:
>> Any success in finding possible solutions for my problem?
>
> Somewhat. The calculation of the log-likelihood values is numerically
> much more robust/stable now. The log-likelihood contributions of some
> individuals became minus infinity in your model. This was caused by
> rounding errors as illustrated in the following simplified example:
>
> log( exp( a ) + exp( b ) )
>
> If a and b become smaller than approximately -800, exp( a ) and exp( b
> ) are rounded to zero and the log of their sum (zero) is minus
> infinity.
> I have solved this problem by replacing the above calculation by
>
> log( exp( a - c ) + exp( b - c ) ) + c
> with c = max( a, b )
>
> The source code of the improved censReg package is available on
> R-Forge [1]; R packages will be available on R-Forge [2] probably
> within one day.
>
> [1] https://r-forge.r-project.org/scm/?group_id=256
> [2] https://r-forge.r-project.org/R/?group_id=256
>
> Unfortunately, the calculation of the gradients is still not robust
> but I expect that I can solve this problem in a similar way as I used
> to solve the problem with the likelihood function itself. I will
> continue working on this.
>
>> I have tried to experiment with size of sample and I get really bad picture.
>> I can't get it work even if sample is ~ 1000 obs. And it is way less than I
>> would like to see working, taking into account my full sample size ~ 540 000
>> obs.
>
> I hope that you have a very fast computer -- or a lot of time for
> waiting many days or even a few weeks.

Now I have also improved the numerical stability of the calculation of
the *gradients* of the log-likelihood function. Basically, the problem
was that in an equation similar to

( exp(a1) * b1 + exp(a2) * b2 ) / exp(d)

a1, a2, and d could have values less than -800 so that exp(a1),
exp(a2), and exp(d) became zero and hence, the entire equation became
zero divided by zero ("NaN") in specific cases. As a1, a2, and d are
usually of similar size, I could solve this problem by re-arranging
the above equation to

exp(a1-d) * b1 + exp(a2-d) * b2

I hope that the numerical stability is sufficiently large now so that
you can estimate your large model. Please let me know if it works now.

Again, the source code of the improved censReg package is available on
R-Forge [1]; R packages will be available on R-Forge [2] probably
within one day. Please note that you need at least revision 1207.

[1] https://r-forge.r-project.org/scm/?group_id=256
[2] https://r-forge.r-project.org/R/?group_id=256

Best regards,
Arne

--
Arne Henningsen
http://www.arne-henningsen.name

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: function censReg in panel data setting

felipnunes
Hi guys,

did you figure out a way to solve your problem, Igor?

I'm still trying to understand what's happening to my estimations. My data set is also big (>50,000) and I get the same error:

Error in maxNRCompute(fn = logLikAttr, fnOrig = fn, gradOrig = grad, hessOrig = hess,  :
  NA in the initial gradient

I already tried many different specifications and the error is always the same. I'm fitting the following model:

tob7 <- censReg(transfers.cap ~ pt.pt + psdb.pt + pt.opp + pt.coa + psdb.coa + pib.cap + transfers.cap.lag + pib.cap + ifdm + log(populat) + mayor.vot.per + log(bol.fam.per+0.01) + factor(uf.name) + factor(year) - 1, left=0, right=Inf, method="BHHH", nGHQ=8, iterlim=10000, data = pdata2)

Thanks
Reply | Threaded
Open this post in threaded view
|

Re: function censReg in panel data setting

Arne Henningsen-4
On 24 September 2011 23:58, felipnunes <[hidden email]> wrote:

> Hi guys,
>
> did you figure out a way to solve your problem, Igor?
>
> I'm still trying to understand what's happening to my estimations. My data
> set is also big (>50,000) and I get the same error:
>
> Error in maxNRCompute(fn = logLikAttr, fnOrig = fn, gradOrig = grad,
> hessOrig = hess,  :
>  NA in the initial gradient
>
> I already tried many different specifications and the error is always the
> same. I'm fitting the following model:
>
> tob7 <- censReg(transfers.cap ~ pt.pt + psdb.pt + pt.opp + pt.coa + psdb.coa
> + pib.cap + transfers.cap.lag + pib.cap + ifdm + log(populat) +
> mayor.vot.per + log(bol.fam.per+0.01) + factor(uf.name) + factor(year) - 1,
> left=0, right=Inf, method="BHHH", nGHQ=8, iterlim=10000, data = pdata2)

Did you upgrade to version 0.5-7 (revision >= 1207) from R-Forge?

/Arne

--
Arne Henningsen
http://www.arne-henningsen.name

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.