Fitting gamma and exponential Distributions with fitdist


vioravis
I am trying to fit gamma and exponential distributions to my data using the fitdist function in the "fitdistrplus" package, and to obtain the parameter estimates along with the AIC values of the fits. However, I get errors with both distributions. A reproducible example, with the errors I get, is below. Can someone please tell me how to overcome this issue?

library("fitdistrplus")
test <- (895.13582915.7447,335.5472,1470.4022,194.5461,1814.2328,
1056.3067,3110.0783,11441.8656,142.1714,2136.0964,1958.9022,
891.89,352.6939,1341.7042,167.4883,2502.0528,1742.1306,
837.1481,867.8533,3590.4308,1125.9889,1200.605,4321.0011,
1873.9706,323.6633,1912.3147,865.6058,2870.8592,236.7214,
580.2861,350.9269,6842.4969,1886.2403,265.5094,199.9825,
1215.6197,7241.8075,2381.9517,3078.1331,5461.3703,2051.3997,
751.6575,714.3536,598.4539,425.6656,215.2103,608.785,
369.4744,2398.6506,918.6844,525.6925,2549.3694,4108.8983,
2824.0758,1068.7508,249.995,3863.9839,1152.1506,531.6844)

fitdist(test,"gamma",method ="mle")

Error in fitdist(test, "gamma", method = "mle") :
  the function mle failed to estimate the parameters,
                with the error code 100
In addition: Warning messages:
1: In dgamma(x, shape, scale, log) : NaNs produced
2: In dgamma(x, shape, scale, log) : NaNs produced
3: In dgamma(x, shape, scale, log) : NaNs produced
4: In dgamma(x, shape, scale, log) : NaNs produced
5: In dgamma(x, shape, scale, log) : NaNs produced
6: In dgamma(x, shape, scale, log) : NaNs produced
7: In dgamma(x, shape, scale, log) : NaNs produced
8: In dgamma(x, shape, scale, log) : NaNs produced
9: In dgamma(x, shape, scale, log) : NaNs produced


fitdist(test,"exp",method ="mle")

Error in fitdist(test, "exp", method = "mle") :
  the function mle failed to estimate the parameters,
                with the error code 100
In addition: Warning message:
In dexp(x, 1/rate, log) : NaNs produced

Thank you.
Ravi

Re: Fitting gamma and exponential Distributions with fitdist

vioravis
There was a small error in the data creation step (a missing c() and a missing comma between the first two values). I have fixed it below:

test <- c(895.1358,2915.7447,335.5472,1470.4022,194.5461,1814.2328,
1056.3067,3110.0783,11441.8656,142.1714,2136.0964,1958.9022,
891.89,352.6939,1341.7042,167.4883,2502.0528,1742.1306,
837.1481,867.8533,3590.4308,1125.9889,1200.605,4321.0011,
1873.9706,323.6633,1912.3147,865.6058,2870.8592,236.7214,
580.2861,350.9269,6842.4969,1886.2403,265.5094,199.9825,
1215.6197,7241.8075,2381.9517,3078.1331,5461.3703,2051.3997,
751.6575,714.3536,598.4539,425.6656,215.2103,608.785,
369.4744,2398.6506,918.6844,525.6925,2549.3694,4108.8983,
2824.0758,1068.7508,249.995,3863.9839,1152.1506,531.6844)

Any help would be appreciated. Thank you.

Re: Fitting gamma and exponential Distributions with fitdist

Joshua Wiley-2
Hi,

I am not incredibly knowledgeable about gamma distributions, but
looking at your data, you have a tiny mean:variance ratio, which, I
believe, means that the bulk of the distribution will be near 0 and
you may run into computational problems (again I think.  I would
gladly be corrected).  This all makes me think it might be a
convergence issue.  Perhaps you can transform your data for estimation
and then transform it back (not sure if this would yield equivalent
results)?

fitdist(test + 10^4, "gamma")
fitdist(test/10^4, "gamma")

Good luck,

Josh


--
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: Fitting gamma and exponential Distributions with fitdist

vioravis
Joshua, thanks for your reply.

I have tried out the following scaling and it seems to work fine:

scaledVariable <- (test-min(test)+0.001)/(max(test)-min(test)+0.002)  

The gamma distribution parameters are estimated from the scaled variable, and samples drawn from this distribution are scaled back using:

scaled <- (randomSamples*(max(test) - min(test) + 0.002)) + min(test) - 0.001

Is there a better way to scale the variable? I would prefer to fit a distribution without scaling the data.
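One possible alternative, offered as a sketch under assumptions rather than as anything taken from the thread: the gamma family is closed under multiplication by a positive constant, so dividing by a constant (with no shift) lets the fit be mapped back exactly. The shape is unchanged, and the rate on the original scale is the scaled rate divided by the constant. The sketch uses base R's optim instead of fitdist so that it runs without extra packages. The names negloglik, scale_const, shape_hat and rate_hat are illustrative, and 1e4 is the constant suggested earlier in the thread.

```r
# Data from the thread (corrected version).
test <- c(895.1358, 2915.7447, 335.5472, 1470.4022, 194.5461, 1814.2328,
          1056.3067, 3110.0783, 11441.8656, 142.1714, 2136.0964, 1958.9022,
          891.89, 352.6939, 1341.7042, 167.4883, 2502.0528, 1742.1306,
          837.1481, 867.8533, 3590.4308, 1125.9889, 1200.605, 4321.0011,
          1873.9706, 323.6633, 1912.3147, 865.6058, 2870.8592, 236.7214,
          580.2861, 350.9269, 6842.4969, 1886.2403, 265.5094, 199.9825,
          1215.6197, 7241.8075, 2381.9517, 3078.1331, 5461.3703, 2051.3997,
          751.6575, 714.3536, 598.4539, 425.6656, 215.2103, 608.785,
          369.4744, 2398.6506, 918.6844, 525.6925, 2549.3694, 4108.8983,
          2824.0758, 1068.7508, 249.995, 3863.9839, 1152.1506, 531.6844)

# Negative gamma log-likelihood. Parameters are optimised on the log scale
# so that shape and rate stay positive (hypothetical helper, not fitdist).
negloglik <- function(logpar, x) {
  -sum(dgamma(x, shape = exp(logpar[1]), rate = exp(logpar[2]), log = TRUE))
}

scale_const <- 1e4            # assumed constant; pure division, no shift
y <- test / scale_const

# Method-of-moments starting values, then Nelder-Mead via optim().
start <- log(c(mean(y)^2 / var(y), mean(y) / var(y)))
fit <- optim(start, negloglik, x = y)

shape_hat   <- exp(fit$par[1])          # shape is unaffected by the scaling
rate_scaled <- exp(fit$par[2])
rate_hat    <- rate_scaled / scale_const  # rate on the original scale

# Samples can then be drawn directly on the original scale.
set.seed(1)
sims <- rgamma(1000, shape = shape_hat, rate = rate_hat)
```

Because no shift is involved, nothing has to be undone after sampling. The min-max transform above, by contrast, subtracts min(test), so the distribution fitted to the scaled variable is not a gamma distribution for the original data.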

Thank you.

Ravi

Re: Fitting gamma and exponential Distributions with fitdist

Prof Brian Ripley
In reply to this post by Joshua Wiley-2
On Wed, 27 Apr 2011, Joshua Wiley wrote:

> Hi,
>
> I am not incredibly knowledgeable about gamma distributions, but
> looking at your data, you have a tiny mean:variance ratio, which, I
> believe, means that the bulk of the distribution will be near 0 and
> you may run into computational problems (again I think.  I would
> gladly be corrected).

No, the data are well-fitted by an exponential of mean about 2000.

>  This all makes me think it might be a
> convergence issue.  Perhaps you can transform your data for estimation
> and then transform it back (not sure if this would yield equivalent
> results)?
>
> fitdist(test + 10^4, "gamma")

No.

> fitdist(test/10^4, "gamma")

Yes.  This is indeed a scaling issue: the estimated rate is very small.
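A short derivation of why dividing by a constant is harmless may help here. It is a sketch, and the symbols c, \alpha and \beta are notation introduced for this note, with \beta the gamma rate and n the sample size:

```latex
\begin{align*}
Y &= X / c, \quad c > 0, \qquad X \sim \mathrm{Gamma}(\alpha, \beta) \\
f_Y(y) &= c\, f_X(c y)
        = \frac{(c\beta)^{\alpha}}{\Gamma(\alpha)}\, y^{\alpha - 1} e^{-(c\beta) y}
  \;\Longrightarrow\; Y \sim \mathrm{Gamma}(\alpha, c\beta) \\
\ell_Y(\alpha, c\beta) &= \ell_X(\alpha, \beta) + n \log c \\
\hat{\alpha}_Y &= \hat{\alpha}_X, \qquad \hat{\beta}_Y = c\, \hat{\beta}_X
\end{align*}
```

For an exponential with mean about 2000, the rate is about 0.0005 on the original scale and about 5 after dividing by 10^4, which is a far more comfortable magnitude for a numerical optimiser. Adding a constant has no such property, since a shifted gamma variable is no longer gamma distributed.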



--
Brian D. Ripley,                  [hidden email]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


Re: Fitting gamma and exponential Distributions with fitdist

vioravis
I tried the same in JMP and got two different recommendations, depending on whether the values are scaled.

With the unscaled values, LogNormal appears to be the best fit. fitdist in R is unable to provide a fit in this case.

Compare Distributions

Show  Distribution      Number of Parameters  -2*LogLikelihood  AICc
X     LogNormal         2                     1016.29587        1020.50639
      Johnson Sl        3                     1015.21183        1021.6404
      GLog              3                     1016.29587        1022.72444
      Exponential       1                     1021.58662        1023.65559
      Johnson Su        4                     1015.21183        1023.9391
      Gamma             2                     1021.02475        1025.23528
      Weibull           2                     1021.50762        1025.71815
      Extreme Value     2                     1021.50762        1025.71815
      Normal 2 Mixture  5                     1042.55455        1053.66566
      Normal 3 Mixture  8                     1042.74433        1061.56786
      Normal            2                     1082.36992        1086.58045


However, with the scaled values, Gamma appears to be the best fit. I get the same result in R as well.

Compare Distributions

Show  Distribution      Number of Parameters  -2*LogLikelihood  AICc
X     Gamma             2                     -114.92911        -110.71858
      Weibull           2                     -113.54302        -109.3325
      Extreme Value     2                     -113.54302        -109.3325
      Exponential       1                     -108.01019        -105.94122
      Johnson Sl        3                     -104.69191        -98.263335
      Johnson Su        4                     -104.69191        -95.964634
      GLog              3                     -102.35037        -95.921798
      LogNormal         2                     -70.727608        -66.517082
      Normal 2 Mixture  5                     -77.349192        -66.238081
      Normal 3 Mixture  8                     -77.159407        -58.335878
      Normal            2                     -37.533813        -33.323287


What is the difference between the MLE methods in JMP and R? Is it advisable to go with the scaled values in R?
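A possible explanation for the change in ranking, sketched under assumptions: it may come from the shift in the min-max transform rather than from any difference between the MLE methods. Dividing by a constant changes every model's log-likelihood by the same amount, so differences in AIC between models are unchanged. Subtracting min(test) moves the smallest observation to almost zero, which affects each family differently. The sketch below compares only the exponential and lognormal fits, using their closed-form maximum-likelihood estimates and plain AIC (the JMP tables report AICc). The helper names aic_exp and aic_lnorm are illustrative.

```r
# Data from the thread (corrected version).
test <- c(895.1358, 2915.7447, 335.5472, 1470.4022, 194.5461, 1814.2328,
          1056.3067, 3110.0783, 11441.8656, 142.1714, 2136.0964, 1958.9022,
          891.89, 352.6939, 1341.7042, 167.4883, 2502.0528, 1742.1306,
          837.1481, 867.8533, 3590.4308, 1125.9889, 1200.605, 4321.0011,
          1873.9706, 323.6633, 1912.3147, 865.6058, 2870.8592, 236.7214,
          580.2861, 350.9269, 6842.4969, 1886.2403, 265.5094, 199.9825,
          1215.6197, 7241.8075, 2381.9517, 3078.1331, 5461.3703, 2051.3997,
          751.6575, 714.3536, 598.4539, 425.6656, 215.2103, 608.785,
          369.4744, 2398.6506, 918.6844, 525.6925, 2549.3694, 4108.8983,
          2824.0758, 1068.7508, 249.995, 3863.9839, 1152.1506, 531.6844)

# AIC of an exponential fit; the MLE of the rate is 1 / mean(x).
aic_exp <- function(x) {
  rate <- 1 / mean(x)
  2 * 1 - 2 * sum(dexp(x, rate = rate, log = TRUE))
}

# AIC of a lognormal fit; the MLE of sdlog uses divisor n, not n - 1.
aic_lnorm <- function(x) {
  lx <- log(x)
  meanlog <- mean(lx)
  sdlog <- sqrt(mean((lx - meanlog)^2))
  2 * 2 - 2 * sum(dlnorm(x, meanlog = meanlog, sdlog = sdlog, log = TRUE))
}

# Pure scaling: divide by a constant only.
scaled <- test / 1e4
# The min-max transform used earlier in the thread (scale plus shift).
shifted <- (test - min(test) + 0.001) / (max(test) - min(test) + 0.002)

diff_raw     <- aic_exp(test)    - aic_lnorm(test)
diff_scaled  <- aic_exp(scaled)  - aic_lnorm(scaled)
diff_shifted <- aic_exp(shifted) - aic_lnorm(shifted)

print(c(raw = diff_raw, scaled = diff_scaled, shifted = diff_shifted))
```

If the raw and scaled differences agree while the shifted one does not, the model comparison on the min-max variable answers a different question from the one on the original data, and dividing by a constant would be the safer way to help the optimiser.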

Thank you.

Ravi