Fitting a Mixture of Noncentral Student t Distributions to a one-dimensional sample

Johannes Moser
Dear R community,

I'd like to extract the parameters of a two-component mixture of
noncentral Student t distributions that was fitted to a
one-dimensional sample.

There are many packages for R that are capable of handling mixture
distributions in one way or another. Some in the context of a Bayesian
framework requiring kernels. Some in a regression framework. Some in a
nonparametric framework. ...

So far the "mixdist" package seems to come closest to what I want. This
package fits parametric mixtures to a sample of data. Unfortunately it
doesn't support the Student t distribution.

I have also tried to manually set up a likelihood function as described
here:
http://stackoverflow.com/questions/6485597/r-how-to-fit-a-large-dataset-with-a-combination-of-distributions
But the result is far from perfect.
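
A hand-written likelihood of this kind might look like the sketch below. It
is only an illustration under stated assumptions: the components are
location-scale Student t densities (shifted and scaled, not noncentral in
the strict sense), only base R is used, and the names t_mix_nll, start and
est as well as the simulated values are made up for the example. Working
with logit(p), log(sigma) and log(nu) keeps the optimiser unconstrained
while the reported estimates stay in their valid ranges.

```r
# Negative log-likelihood of a two-component location-scale t mixture.
# theta = (logit(p), mu1, log(sigma1), log(nu1), mu2, log(sigma2), log(nu2))
t_mix_nll <- function(theta, x) {
  # log(weight) + log-density of each component, evaluated on the log scale
  l1 <- plogis(theta[1], log.p = TRUE) +
    dt((x - theta[2]) / exp(theta[3]), df = exp(theta[4]), log = TRUE) - theta[3]
  l2 <- plogis(-theta[1], log.p = TRUE) +
    dt((x - theta[5]) / exp(theta[6]), df = exp(theta[7]), log = TRUE) - theta[6]
  # log-sum-exp over the two components to avoid underflow in the tails
  m <- pmax(l1, l2)
  -sum(m + log(exp(l1 - m) + exp(l2 - m)))
}

# Simulated data (hypothetical values, for illustration only)
set.seed(1)
n <- 2000
z <- rbinom(n, 1, 0.6)
x <- ifelse(z == 1, 3 + 1 * rt(n, df = 25), -6 + 5 * rt(n, df = 3))

# Crude starting values derived from the sample
start <- c(0,
           as.numeric(quantile(x, 0.75)), log(sd(x) / 2), log(10),
           as.numeric(quantile(x, 0.25)), log(sd(x)),     log(10))

fit <- optim(start, t_mix_nll, x = x, method = "BFGS",
             control = list(maxit = 500))

# Back-transform to the natural parameter scale
est <- c(p      = plogis(fit$par[1]),
         mu1    = fit$par[2], sigma1 = exp(fit$par[3]), nu1 = exp(fit$par[4]),
         mu2    = fit$par[5], sigma2 = exp(fit$par[6]), nu2 = exp(fit$par[7]))
round(est, 3)
```

Mixture likelihoods usually have several local optima and the component
labels can swap, so it is worth rerunning optim() from a few different
start vectors and keeping the fit with the smallest fit$value.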

The "gamlss.mx" package might help, but it seems to have been designed
for a different context, namely regression. I tried to regress my data
on a constant and then extract the parameters of the estimated mixture
error distribution. But the estimated parameters do not seem to be
directly accessible individually through some command (such as
fit1$sigma). There also seem to be serious convergence problems even in
fairly simple and unambiguous cases (see example 2). My gamlss.mx setup
so far is the following:


     library(gamlss.dist)
     library(gamlss.mx)
     library(MASS)

     # 1:
     data(geyser)
     plot(density(geyser$waiting) )
     fit1 <- gamlssMX(waiting~1,data=geyser,family="TF",K=2)
     fit1
     # works fine

     # 2:
     N <- 100000
     components <- sample(1:2,prob=c(0.6,0.4),size=N,replace=TRUE)
     mus <- c(3,-6)
     sds <- c(1,9)
     nus <- c(25,3)
     mixsim <- data.frame(rTF2(N, mu=mus[components], sigma=sds[components],
                               nu=nus[components]))
     colnames(mixsim) <- "MCsim"
     plot(density(mixsim$MCsim) , xlim=c(-50,50))
     fit2 <- gamlssMX(MCsim~1,data=mixsim,family="TF",K=2)
     fit2
     # no convergence

With another dataset and when using the same two component densities for
the mixture as above I ended up with negative estimates for sigma (which
should be positive).

I would be very grateful for any advice. I've read through many manuals
and vignettes today, but it seems that I am nearly in the same place
where I was this morning.
A small example of a setup that works reasonably reliably would be fantastic!

Thanks a lot in advance!!
Johannes

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: Fitting a Mixture of Noncentral Student t Distributions to a one-dimensional sample

Ingmar Visser
Hi Johannes,
The code below gives good results for me. Note that trying multiple
starting values is often important when fitting mixture models, even in
simple cases like this.
Note also that the sigma and nu parameters in gamlssMX are fitted on a
log scale, which is why negative estimates can appear; exponentiate them
to get back to the original scale.
hth, Ingmar

    # 2:
    N <- 2000
    components <- sample(1:2,prob=c(0.6,0.4),size=N,replace=TRUE)
    mus <- c(3,-6)
    sds <- c(1,9)
    nus <- c(25,3)
    mixsim <- data.frame(rTF2(N,mu=mus[components],sigma=sds[components],nu=nus[components]))
    colnames(mixsim) <- "MCsim"
    plot(density(mixsim$MCsim) , xlim=c(-50,50))

    set.seed(2)  # fix the seed so the random start of the fit is reproducible
    fit2 <- gamlssMX(MCsim~1,data=mixsim,family="TF",K=2)
    fit2
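
The log-scale remark can be made concrete with a small base-R sketch. The
two coefficient values are invented for illustration and are not output of
the fit above; sd_to_tf_scale is a hypothetical helper name. The sketch
also assumes the gamlss.dist parameterisation in which rTF2() takes sigma
as the standard deviation while family "TF" uses the scale of the t
distribution, so the fitted sigma is not expected to equal the sds used in
the simulation.

```r
# Hypothetical log-scale coefficients as printed for one mixture component
# (illustrative numbers, not output from the fit above)
log_sigma_hat <- -0.12
log_nu_hat    <-  2.90

sigma_hat <- exp(log_sigma_hat)   # scale parameter, always positive
nu_hat    <- exp(log_nu_hat)      # degrees of freedom, always positive

# For nu > 2 a t distribution with scale sigma has
#   sd = sigma * sqrt(nu / (nu - 2)),
# so a standard deviation converts to the TF scale as follows.
sd_to_tf_scale <- function(sd, nu) {
  stopifnot(all(nu > 2))
  sd * sqrt((nu - 2) / nu)
}

# Scale values a TF fit would be expected to target for the two
# simulated components (sds = c(1, 9), nus = c(25, 3))
sd_to_tf_scale(sd = c(1, 9), nu = c(25, 3))
```

A negative value on the log scale therefore simply corresponds to a sigma
or nu below 1 on the natural scale.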

On Wed, Apr 30, 2014 at 10:30 PM, Johannes Moser <[hidden email]> wrote:

> [...]

Restricted fitting of two-component mixture distribution in R possible?

Johannes Moser
Dear list,

In fitting a two-component Student's t mixture distribution to some data
(standardized GARCH residuals), one of the components has an estimated
degrees-of-freedom parameter of 0.6. This means that not even the first
moment of the mixture distribution would exist.

The gamlss.mx package in R is used for estimation. gamlss.control,
glim.control and MX.control do not seem to offer an option for
constraining this parameter, or I couldn't find out how.

Is there a way to restrict the degrees-of-freedom estimates to
be larger than, say, 3?
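
To make the kind of restriction meant here concrete: outside gamlss.mx, in
a hand-written likelihood, a lower bound can be imposed by
reparameterisation, writing nu = nu_min + exp(eta) and optimising over eta.
The sketch below uses base R only and is not a gamlss.mx feature; the
function name t_mix_nll_bounded, the argument nu_min and the placeholder
data are assumptions made for illustration.

```r
# Two-component location-scale t mixture with nu_k = nu_min + exp(eta_k),
# which keeps both degrees-of-freedom parameters above nu_min.
# theta = (logit(p), mu1, log(sigma1), eta1, mu2, log(sigma2), eta2)
t_mix_nll_bounded <- function(theta, x, nu_min = 3) {
  comp <- function(log_w, mu, log_sigma, eta) {
    log_w + dt((x - mu) / exp(log_sigma), df = nu_min + exp(eta), log = TRUE) -
      log_sigma
  }
  l1 <- comp(plogis(theta[1],  log.p = TRUE), theta[2], theta[3], theta[4])
  l2 <- comp(plogis(-theta[1], log.p = TRUE), theta[5], theta[6], theta[7])
  m  <- pmax(l1, l2)                      # log-sum-exp for numerical stability
  -sum(m + log(exp(l1 - m) + exp(l2 - m)))
}

# Placeholder data with very heavy tails (not real GARCH residuals)
set.seed(42)
x <- c(rnorm(700), 3 * rt(300, df = 1))

start <- c(0, 0, 0, 0, 0, log(3), 0)
fit <- optim(start, t_mix_nll_bounded, x = x, nu_min = 3, method = "BFGS")

# Back-transformed degrees of freedom; neither can fall below nu_min
nu_hat <- 3 + exp(fit$par[c(4, 7)])
nu_hat
```

If the data favour heavier tails than the bound allows, the corresponding
estimate simply ends up at or very near nu_min, which is itself a sign
that the restricted model is under strain.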

Many thanks in advance,
Johannes
