

I am trying to fit gamma and exponential distributions to my data using the fitdist function in the "fitdistrplus" package, and to obtain the parameter estimates along with the AIC values of the fits. However, I get errors with both distributions. A reproducible example, with the errors I am getting, is given below. Can someone please let me know how to overcome this issue?
library("fitdistrplus")
test <- (895.13582915.7447,335.5472,1470.4022,194.5461,1814.2328,
1056.3067,3110.0783,11441.8656,142.1714,2136.0964,1958.9022,
891.89,352.6939,1341.7042,167.4883,2502.0528,1742.1306,
837.1481,867.8533,3590.4308,1125.9889,1200.605,4321.0011,
1873.9706,323.6633,1912.3147,865.6058,2870.8592,236.7214,
580.2861,350.9269,6842.4969,1886.2403,265.5094,199.9825,
1215.6197,7241.8075,2381.9517,3078.1331,5461.3703,2051.3997,
751.6575,714.3536,598.4539,425.6656,215.2103,608.785,
369.4744,2398.6506,918.6844,525.6925,2549.3694,4108.8983,
2824.0758,1068.7508,249.995,3863.9839,1152.1506,531.6844)
fitdist(test,"gamma",method ="mle")
Error in fitdist(test, "gamma", method = "mle") :
the function mle failed to estimate the parameters,
with the error code 100
In addition: Warning messages:
1: In dgamma(x, shape, scale, log) : NaNs produced
2: In dgamma(x, shape, scale, log) : NaNs produced
3: In dgamma(x, shape, scale, log) : NaNs produced
4: In dgamma(x, shape, scale, log) : NaNs produced
5: In dgamma(x, shape, scale, log) : NaNs produced
6: In dgamma(x, shape, scale, log) : NaNs produced
7: In dgamma(x, shape, scale, log) : NaNs produced
8: In dgamma(x, shape, scale, log) : NaNs produced
9: In dgamma(x, shape, scale, log) : NaNs produced
fitdist(test,"exp",method ="mle")
Error in fitdist(test, "exp", method = "mle") :
the function mle failed to estimate the parameters,
with the error code 100
In addition: Warning message:
In dexp(x, 1/rate, log) : NaNs produced
Thank you.
Ravi


There was a small error in the data creation step; I have fixed it as below:
test <- c(895.1358,2915.7447,335.5472,1470.4022,194.5461,1814.2328,
1056.3067,3110.0783,11441.8656,142.1714,2136.0964,1958.9022,
891.89,352.6939,1341.7042,167.4883,2502.0528,1742.1306,
837.1481,867.8533,3590.4308,1125.9889,1200.605,4321.0011,
1873.9706,323.6633,1912.3147,865.6058,2870.8592,236.7214,
580.2861,350.9269,6842.4969,1886.2403,265.5094,199.9825,
1215.6197,7241.8075,2381.9517,3078.1331,5461.3703,2051.3997,
751.6575,714.3536,598.4539,425.6656,215.2103,608.785,
369.4744,2398.6506,918.6844,525.6925,2549.3694,4108.8983,
2824.0758,1068.7508,249.995,3863.9839,1152.1506,531.6844)
Any help would be appreciated. Thank you.


Hi,
I am not incredibly knowledgeable about gamma distributions, but
looking at your data, you have a tiny mean:variance ratio, which, I
believe, means that the bulk of the distribution will be near 0 and
you may run into computational problems (again I think. I would
gladly be corrected). This all makes me think it might be a
convergence issue. Perhaps you can transform your data for estimation
and then transform it back (not sure if this would yield equivalent
results)?
fitdist(test + 10^4, "gamma")
fitdist(test/10^4, "gamma")
Good luck,
Josh

Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/
______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Joshua, thanks for your reply.
I have tried out the following scaling and it seems to work fine:
scaledVariable <- (test - min(test) + 0.001)/(max(test) - min(test) + 0.002)
The gamma distribution parameters are estimated from the scaled variable, and samples drawn from the fitted distribution are scaled back using:
scaled <- (randomSamples*(max(test) - min(test) + 0.002)) + min(test) - 0.001
Is there a better way to scale the variable? I would prefer to fit a distribution without scaling the data.
Thank you.
Ravi
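
[Editorial sketch, not part of the original posts.] One alternative to min-max scaling is to divide by a constant only. A gamma distribution stays a gamma distribution under division by a constant, so the parameters themselves can be converted back and no samples need to be rescaled: the shape is unchanged and the rate is divided by the constant. The sketch below assumes simulated data as a stand-in for the data in this thread, a hypothetical scaling constant k, and base R's optim() in place of fitdist() so that it runs without extra packages.

```r
# Simulated stand-in for the thread's data (assumption: mean about 2000).
set.seed(1)
x <- rgamma(60, shape = 1, rate = 1/2000)

# Negative log-likelihood of a gamma sample. The parameters are passed on
# the log scale so the optimiser can never propose a negative shape or rate.
nll <- function(logpar, data) {
  -sum(dgamma(data, shape = exp(logpar[1]), rate = exp(logpar[2]), log = TRUE))
}

k  <- 1e4        # hypothetical scaling constant
xs <- x / k      # scaled data

# Method-of-moments starting values, on the log scale.
start <- log(c(mean(xs)^2 / var(xs), mean(xs) / var(xs)))
fit   <- optim(start, nll, data = xs)

shape_hat   <- exp(fit$par[1])   # shape is unaffected by the scaling
rate_scaled <- exp(fit$par[2])   # rate for the scaled data
rate_hat    <- rate_scaled / k   # rate on the original scale

# Log-likelihood on the original scale: the scaled value minus n*log(k).
loglik_orig <- -fit$value - length(x) * log(k)
```

With fitdistrplus, the same conversion would apply to the estimate component of the object returned by fitdist(test/10^4, "gamma"). Min-max scaling also subtracts a shift, so it fits a different (shifted) model, and its parameters cannot be converted back this way.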


On Wed, 27 Apr 2011, Joshua Wiley wrote:
> Hi,
>
> I am not incredibly knowledgeable about gamma distributions, but
> looking at your data, you have a tiny mean:variance ratio, which, I
> believe, means that the bulk of the distribution will be near 0 and
> you may run into computational problems (again I think. I would
> gladly be corrected).
No, the data are well-fitted by an exponential of mean about 2000.
> This all makes me think it might be a
> convergence issue. Perhaps you can transform your data for estimation
> and then transform it back (not sure if this would yield equivalent
> results)?
>
> fitdist(test + 10^4, "gamma")
No.
> fitdist(test/10^4, "gamma")
Yes. This is indeed a scaling issue: the estimated rate is very small.
> Good luck,
>
> Josh
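
[Editorial sketch, not part of the original posts.] To illustrate the scaling point: with a mean near 2000 the rate is of order 1/2000, far from an optimiser's typical step size. The sketch below is one possible way to fit on the original scale with base R only. It is an assumption of this sketch, not what fitdist does internally. The exponential rate is taken from its closed-form MLE (1/mean), and the gamma likelihood is optimised over log(shape) and log(rate) so that a tiny rate causes no trouble. The names nll_gamma, m2ll_exp and so on are hypothetical.

```r
# Corrected data from the thread.
test <- c(895.1358, 2915.7447, 335.5472, 1470.4022, 194.5461, 1814.2328,
          1056.3067, 3110.0783, 11441.8656, 142.1714, 2136.0964, 1958.9022,
          891.89, 352.6939, 1341.7042, 167.4883, 2502.0528, 1742.1306,
          837.1481, 867.8533, 3590.4308, 1125.9889, 1200.605, 4321.0011,
          1873.9706, 323.6633, 1912.3147, 865.6058, 2870.8592, 236.7214,
          580.2861, 350.9269, 6842.4969, 1886.2403, 265.5094, 199.9825,
          1215.6197, 7241.8075, 2381.9517, 3078.1331, 5461.3703, 2051.3997,
          751.6575, 714.3536, 598.4539, 425.6656, 215.2103, 608.785,
          369.4744, 2398.6506, 918.6844, 525.6925, 2549.3694, 4108.8983,
          2824.0758, 1068.7508, 249.995, 3863.9839, 1152.1506, 531.6844)
n <- length(test)

# Exponential: the MLE of the rate is 1/mean, so no optimiser is needed.
rate_exp <- 1 / mean(test)
m2ll_exp <- -2 * sum(dexp(test, rate = rate_exp, log = TRUE))
aic_exp  <- m2ll_exp + 2 * 1          # one estimated parameter

# Gamma: minimise the negative log-likelihood over log(shape), log(rate).
nll_gamma <- function(logpar) {
  -sum(dgamma(test, shape = exp(logpar[1]), rate = exp(logpar[2]), log = TRUE))
}
# Method-of-moments starting values, on the log scale.
start <- log(c(mean(test)^2 / var(test), mean(test) / var(test)))
fit   <- optim(start, nll_gamma)

shape_gamma <- exp(fit$par[1])
rate_gamma  <- exp(fit$par[2])
m2ll_gamma  <- 2 * fit$value
aic_gamma   <- m2ll_gamma + 2 * 2     # two estimated parameters
```

Supplying explicit starting values or bounds to fitdist, or rescaling as suggested above, are other routes to the same estimates.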

Brian D. Ripley,                  [hidden email]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


I tried using JMP on the same data and get two different recommendations, depending on whether the values are scaled.
With the unscaled values, LogNormal appears to be the best fit. fitdist in R is unable to provide a fit in this case.
Compare Distributions
Show  Distribution      Number of Parameters  -2*LogLikelihood  AICc
X     LogNormal         2                     1016.29587        1020.50639
      Johnson Sl        3                     1015.21183        1021.6404
      GLog              3                     1016.29587        1022.72444
      Exponential       1                     1021.58662        1023.65559
      Johnson Su        4                     1015.21183        1023.9391
      Gamma             2                     1021.02475        1025.23528
      Weibull           2                     1021.50762        1025.71815
      Extreme Value     2                     1021.50762        1025.71815
      Normal 2 Mixture  5                     1042.55455        1053.66566
      Normal 3 Mixture  8                     1042.74433        1061.56786
      Normal            2                     1082.36992        1086.58045
However, when using the scaled values, Gamma appears to be best fit. I am getting the same using R as well.
Compare Distributions
Show  Distribution      Number of Parameters  -2*LogLikelihood  AICc
X     Gamma             2                     -114.92911        -110.71858
      Weibull           2                     -113.54302        -109.3325
      Extreme Value     2                     -113.54302        -109.3325
      Exponential       1                     -108.01019        -105.94122
      Johnson Sl        3                     -104.69191        -98.263335
      Johnson Su        4                     -104.69191        -95.964634
      GLog              3                     -102.35037        -95.921798
      LogNormal         2                     -70.727608        -66.517082
      Normal 2 Mixture  5                     -77.349192        -66.238081
      Normal 3 Mixture  8                     -77.159407        -58.335878
      Normal            2                     -37.533813        -33.323287
What is the difference between the MLE methods in JMP and R??? Is it advisable to go with the scaled values in R???
Thank you.
Ravi
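
[Editorial sketch, not part of the original posts.] One likely reason the ranking changes between the two JMP tables is that the scaling used here subtracts the minimum as well as dividing by the range. Dividing by a constant k shifts every model's -2*LogLikelihood by the same amount (-2*n*log(k)), so differences between models are preserved. Subtracting the minimum changes the shape of the data near zero, so each distribution is affected differently. The sketch below illustrates this on simulated data (an assumption), using plain AIC rather than JMP's AICc and closed-form MLEs for the exponential and lognormal. The helper names aic_exp, aic_lnorm and gap are hypothetical.

```r
# Simulated stand-in for the thread's data (assumption: mean about 2000).
set.seed(1)
x <- rexp(60, rate = 1/2000)

# AIC of an exponential fit; the MLE of the rate is 1/mean.
aic_exp <- function(d) {
  rate <- 1 / mean(d)
  -2 * sum(dexp(d, rate = rate, log = TRUE)) + 2 * 1
}

# AIC of a lognormal fit; the MLEs are the mean and the (divisor-n)
# standard deviation of log(d).
aic_lnorm <- function(d) {
  mu    <- mean(log(d))
  sigma <- sqrt(mean((log(d) - mu)^2))
  -2 * sum(dlnorm(d, meanlog = mu, sdlog = sigma, log = TRUE)) + 2 * 2
}

# Difference in AIC between the two models on a given data set.
gap <- function(d) aic_lnorm(d) - aic_exp(d)

k             <- 1e4   # hypothetical scaling constant
scaled_div    <- x / k
# Min-max scaling with the offsets used earlier in this thread.
scaled_minmax <- (x - min(x) + 0.001) / (max(x) - min(x) + 0.002)

gap_raw    <- gap(x)              # original scale
gap_div    <- gap(scaled_div)     # equal to gap_raw up to rounding
gap_minmax <- gap(scaled_minmax)  # generally different from gap_raw
```

AIC values from scaled and unscaled data should therefore not be compared with each other. Within one data set, the comparison between distributions carries over to the original scale only when the scaling is a pure division.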

