Distribution Fitting

classic Classic list List threaded Threaded
4 messages Options
Reply | Threaded
Open this post in threaded view
|

Distribution Fitting

Lorenzo Isella
 Dear All,
I am experiencing some problems in fitting a trimodal distribution (which
should be thought as a sum of three Gaussian distributions) using the nls
function for nonlinear fittings.
As a test, just consider the very simple code:

rm(list=ls());
mydata<-rnorm(10000,0,4);
mydens<-density(mydata,kernel="gaussian");
y1<-mydens$y;
x1<-mydens$x;
myfit<-nls(y1~A*exp(-x1^2/sig));


which I use to get the empirical density (as I would from real experimental
data) and test it against a Gaussian ansatz.
Well, either R always crashes for a segmentation fault and I have to restart
it manually or I get this output:

Error in match.call(definition, call, expand.dots) :
'.Primitiv�i�d�������������...' is not a function

Am I missing the obvious or is there some bug in my R build?
Many thanks

Lorenzo

        [[alternative HTML version deleted]]


_______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-finance
Reply | Threaded
Open this post in threaded view
|

Re: Distribution Fitting

Dirk Eddelbuettel

On 24 April 2006 at 00:10, Lorenzo Isella wrote:
|  Dear All,
| I am experiencing some problems in fitting a trimodal distribution (which
| should be thought as a sum of three Gaussian distributions) using the nls
| function for nonlinear fittings.
| As a test, just consider the very simple code:
|
| rm(list=ls());
| mydata<-rnorm(10000,0,4);
| mydens<-density(mydata,kernel="gaussian");
| y1<-mydens$y;
| x1<-mydens$x;
| myfit<-nls(y1~A*exp(-x1^2/sig));
|
|
| which I use to get the empirical density (as I would from real experimental
| data) and test it against a Gaussian ansatz.
| Well, either R always crashes for a segmentation fault and I have to restart
| it manually or I get this output:
|
| Error in match.call(definition, call, expand.dots) :
| '.Primitiv�i�d�������������...' is not a function
|
| Am I missing the obvious or is there some bug in my R build?

Are you aware of fitdistr() in MASS?

> library(MASS)
> mydata<-rnorm(10000,0,4)
> fitdistr(mydata, "normal")
        mean                  sd            
  -0.0055755191185632    3.9956609772436904
 ( 0.0399566097724369) ( 0.0282535897233148)
>

fitdistr() fits a bunch of distribution. I can't recall whether I used this,
or one of the mixed distribution packages from CRAN when I was doing some
work on mixtures for fat tails.

Dirk

--
Hell, there are no rules here - we're trying to accomplish something.
                                                  -- Thomas A. Edison

_______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-finance
Reply | Threaded
Open this post in threaded view
|

Re: Distribution Fitting

Krishna Kumar-2
mixdist is what you want it works very well and gets you nice gaussian
mixtures.

The issue is with calibration of these mixtures and I would like to be
shown otherwise..or just wrong!

But if you have three distribution there could be two different
combination of distribution parameters that can fit the data equally well.
This becomes a problem of how to divvy up say the variance between two
of the gaussians..and how good an optimizer you have..

Krishna




Dirk Eddelbuettel wrote:

>On 24 April 2006 at 00:10, Lorenzo Isella wrote:
>|  Dear All,
>| I am experiencing some problems in fitting a trimodal distribution (which
>| should be thought as a sum of three Gaussian distributions) using the nls
>| function for nonlinear fittings.
>| As a test, just consider the very simple code:
>|
>| rm(list=ls());
>| mydata<-rnorm(10000,0,4);
>| mydens<-density(mydata,kernel="gaussian");
>| y1<-mydens$y;
>| x1<-mydens$x;
>| myfit<-nls(y1~A*exp(-x1^2/sig));
>|
>|
>| which I use to get the empirical density (as I would from real experimental
>| data) and test it against a Gaussian ansatz.
>| Well, either R always crashes for a segmentation fault and I have to restart
>| it manually or I get this output:
>|
>| Error in match.call(definition, call, expand.dots) :
>| '.Primitiv�i�d�������������...' is not a function
>|
>| Am I missing the obvious or is there some bug in my R build?
>
>Are you aware of fitdistr() in MASS?
>
>  
>
>>library(MASS)
>>mydata<-rnorm(10000,0,4)
>>fitdistr(mydata, "normal")
>>    
>>
>        mean                  sd            
>  -0.0055755191185632    3.9956609772436904
> ( 0.0399566097724369) ( 0.0282535897233148)
>  
>
>
>fitdistr() fits a bunch of distribution. I can't recall whether I used this,
>or one of the mixed distribution packages from CRAN when I was doing some
>work on mixtures for fat tails.
>
>Dirk
>
>  
>

_______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-finance
Reply | Threaded
Open this post in threaded view
|

Re: Distribution Fitting

Lorenzo Isella
In reply to this post by Lorenzo Isella
On Mon, 2006-04-24 at 15:08 +0200, Martin Maechler wrote:

> >>>>> "Lorenzo" == Lorenzo Isella <[hidden email]>
> >>>>>     on Mon, 24 Apr 2006 00:10:41 +0200 writes:
>
>     Lorenzo> Dear All,
>     Lorenzo> I am experiencing some problems in fitting a trimodal distribution (which
>     Lorenzo> should be thought as a sum of three Gaussian distributions) using the nls
>     Lorenzo> function for nonlinear fittings.
>     Lorenzo> As a test, just consider the very simple code:
>
>     Lorenzo> rm(list=ls());
>     Lorenzo> mydata<-rnorm(10000,0,4);
>     Lorenzo> mydens<-density(mydata,kernel="gaussian");
>     Lorenzo> y1<-mydens$y;
>     Lorenzo> x1<-mydens$x;
>     Lorenzo> myfit<-nls(y1~A*exp(-x1^2/sig));
>
> (it's slightly inefficient and completely unnecessary to append
>  an empty statement to every line -- which you
>  do incidentally if you add superfluous ';' at end of lines )
>
>
>     Lorenzo> which I use to get the empirical density (as I would from real experimental
>     Lorenzo> data) and test it against a Gaussian ansatz.
>     Lorenzo> Well, either R always crashes for a segmentation fault and I have to restart
>     Lorenzo> it manually or I get this output:
>
>     Lorenzo> Error in match.call(definition, call, expand.dots) :
>     Lorenzo> '.Primitiv�i�d�������������...' is not a function
>
> I cannot confirm any segmentation fault (and if you really get
> them, indeed your R *installation* is buggy (or outdated)), but
> indeed, you've revealed a buglet in nls() - which is still
> present in R 2.3.0 which has been released a few hours ago.
> Thank you for reporting it as a reproducible example!
>
> I'm going to try to fix the bug.
>
>     Lorenzo> Am I missing the obvious or is there some bug in my R build?
>
> As others have said (implicitly): you're missing the facts
>
> -  that it's rather bad idea to fit "data" to a density estimate.
> -  if your density has a closed from, you should rather use
>    maximum likelihood, typically most easily  via MASS::fitdistr
>
> Martin

Thanks everybody for their advice.
Granted that I had better use fitdistr for my particular problem, how do
I feed a particular density function of my own choice into it?
E.g., consider the case of Gaussian-distributed data and suppose there
is no "normal" argument for the fitdistr().

mysample<-rnorm(10000,2,4)

You want to model the data as drawn from the probability distribution
A*exp(-(x-mu)^2/sig), where x is an experimental data and have A, mu and
sigma as parameters to fit.
I read the help(fitdistr) page, but there is not an example with a
user-provided probability density and I did not get very far with my
attempts.
Best regards and thanks in advance for your help

Lorenzo

_______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-finance