linear functional relationships with heteroscedastic & non-Gaussian errors - any packages around?

classic Classic list List threaded Threaded
7 messages Options
Reply | Threaded
Open this post in threaded view
|

linear functional relationships with heteroscedastic & non-Gaussian errors - any packages around?

Jarle Brinchmann-2
[apologies if this appears twice]

Hi,

I have a situation where I have a set of pairs of X & Y variables for
each of which I have a (fairly) well-defined PDF. The PDF(x_i) 's and
PDF(y_i)'s  are unfortunately often rather non-Gaussian although most
of the time not multi-modal.

For these data (estimates of gas content in galaxies), I need to
quantify a linear functional relationship and I am trying to do this
as carefully as I can. At the moment I am carrying out a Monte Carlo
estimation, sampling from each PDF(x_i) and PDF(y_i) and using a
orthogonal linear fit for each realisation but that is not very
satisfactory as it leads to different linear relationships depending
on whether I do the orhtogonal fit on x or y (as the errors on X & Y
are quite different & non-Gaussian using the covariance matrix isn't
all that useful
either)

Does anybody know of code in R to do this kind of fitting in a
Bayesian framework? My concern isn't so much on getting _the_ best
slope estimate but rather to have a good estimate of the uncertainty
on the slope.

      Cheers,
            Jarle.

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: linear functional relationships with heteroscedastic & non-Gaussian errors - any packages around?

Prof Brian Ripley
I wonder if you are using this term in its correct technical sense.
A linear functional relationship is

V = a + bU
X = U + e
Y = V + f

e and f are random errors (often but not necessarily independent) with
distributions possibly depending on U and V respectively.

and pairs from (X,Y) are observed.  As U and V are not random, there is
no PDF of X or Y: each X_i has a different distribution.  If you know
the error distribution for each X_i and Y_i, you can easily write down a
log-likelihood and maximize it.

The first hit I got on Google,
http://www.rsc.org/Membership/Networking/InterestGroups/Analytical/AMC/Software/FREML.asp,

has a reference to a paper for the Gaussian case.

But finding R code for the non-Gaussian case seems a very long shot.

Jarle Brinchmann wrote:
> [apologies if this appears twice]

It did ...

>
> Hi,
>
> I have a situation where I have a set of pairs of X & Y variables for
> each of which I have a (fairly) well-defined PDF. The PDF(x_i) 's and
> PDF(y_i)'s  are unfortunately often rather non-Gaussian although most
> of the time not multi-modal.
>
> For these data (estimates of gas content in galaxies), I need to
> quantify a linear functional relationship and I am trying to do this
> as carefully as I can. At the moment I am carrying out a Monte Carlo
> estimation, sampling from each PDF(x_i) and PDF(y_i) and using a
> orthogonal linear fit for each realisation but that is not very
> satisfactory as it leads to different linear relationships depending
> on whether I do the orhtogonal fit on x or y (as the errors on X & Y
> are quite different & non-Gaussian using the covariance matrix isn't
> all that useful
> either)
>
> Does anybody know of code in R to do this kind of fitting in a
> Bayesian framework? My concern isn't so much on getting _the_ best
> slope estimate but rather to have a good estimate of the uncertainty
> on the slope.
>
>       Cheers,
>             Jarle.


--
Brian D. Ripley,                  [hidden email]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: linear functional relationships with heteroscedastic & non-Gaussian errors - any packages around?

Prof Brian Ripley
In reply to this post by Jarle Brinchmann-2
I wonder if you are using this term in its correct technical sense.
A linear functional relationship is

V = a + bU
X = U + e
Y = V + f

e and f are random errors (often but not necessarily independent) with
distributions possibly depending on U and V respectively.

and pairs from (X,Y) are observed.  As U and V are not random, there is
no PDF of X or Y: each X_i has a different distribution.  If you know
the error distribution for each X_i and Y_i, you can easily write down a
log-likelihood and maximize it.

The first hit I got on Google,
http://www.rsc.org/Membership/Networking/InterestGroups/Analytical/AMC/Software/FREML.asp,

has a reference to a paper for the Gaussian case.

But finding R code for the non-Gaussian case seems a very long shot.

Jarle Brinchmann wrote:
> [apologies if this appears twice]

It did ...

>
> Hi,
>
> I have a situation where I have a set of pairs of X & Y variables for
> each of which I have a (fairly) well-defined PDF. The PDF(x_i) 's and
> PDF(y_i)'s  are unfortunately often rather non-Gaussian although most
> of the time not multi-modal.
>
> For these data (estimates of gas content in galaxies), I need to
> quantify a linear functional relationship and I am trying to do this
> as carefully as I can. At the moment I am carrying out a Monte Carlo
> estimation, sampling from each PDF(x_i) and PDF(y_i) and using a
> orthogonal linear fit for each realisation but that is not very
> satisfactory as it leads to different linear relationships depending
> on whether I do the orhtogonal fit on x or y (as the errors on X & Y
> are quite different & non-Gaussian using the covariance matrix isn't
> all that useful
> either)
>
> Does anybody know of code in R to do this kind of fitting in a
> Bayesian framework? My concern isn't so much on getting _the_ best
> slope estimate but rather to have a good estimate of the uncertainty
> on the slope.
>
>       Cheers,
>             Jarle.


--
Brian D. Ripley,                  [hidden email]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: linear functional relationships with heteroscedastic & non-Gaussian errors - any packages around?

Jarle Brinchmann-2
Thanks for the reply!

On Tue, Dec 2, 2008 at 6:34 PM, Prof Brian Ripley <[hidden email]> wrote:
> I wonder if you are using this term in its correct technical sense.
> A linear functional relationship is
>
> V = a + bU
> X = U + e
> Y = V + f
>
> e and f are random errors (often but not necessarily independent) with
> distributions possibly depending on U and V respectively.

This is indeed what I mean, poor phrasing of me. What I have is the
effectively the PDF for e & f for each instance, and I wish to get a &
b. For Gaussian errors there are certainly various ways to approach it
and the maximum-likelihood estimator is fine and is what I normally
use when my errors are sort-of-normal.

However in this instance my uncertainty estimates are strongly
non-Gaussian and even defining the mode of the distribution becomes
rather iffy so  I really prefer to sample the likelihoods properly.
Using the maximum-likelihood estimator naively in this case is not
terribly useful and I have no idea what the derived confidence limits
"means".

Ah well, I guess what I have to do at the moment is to use brute force
and try to calculate P(a,b|X,Y) properly using a marginalisation over
U (I hadn't done that before, I expect that was part of my problem).
Hopefully that will give reasonable uncertainty estimates, lots of
pain for a figure nobody will look at for much time :)

                 Thanks,
                     Jarle.

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: linear functional relationships with heteroscedastic & non-Gaussian errors - any packages around?

Spencer Graves
      Isn't this a special case of structural equation modeling, handled
by the 'sem' package?

      Spencer

Jarle Brinchmann wrote:

> Thanks for the reply!
>
> On Tue, Dec 2, 2008 at 6:34 PM, Prof Brian Ripley <[hidden email]> wrote:
>  
>> I wonder if you are using this term in its correct technical sense.
>> A linear functional relationship is
>>
>> V = a + bU
>> X = U + e
>> Y = V + f
>>
>> e and f are random errors (often but not necessarily independent) with
>> distributions possibly depending on U and V respectively.
>>    
>
> This is indeed what I mean, poor phrasing of me. What I have is the
> effectively the PDF for e & f for each instance, and I wish to get a &
> b. For Gaussian errors there are certainly various ways to approach it
> and the maximum-likelihood estimator is fine and is what I normally
> use when my errors are sort-of-normal.
>
> However in this instance my uncertainty estimates are strongly
> non-Gaussian and even defining the mode of the distribution becomes
> rather iffy so  I really prefer to sample the likelihoods properly.
> Using the maximum-likelihood estimator naively in this case is not
> terribly useful and I have no idea what the derived confidence limits
> "means".
>
> Ah well, I guess what I have to do at the moment is to use brute force
> and try to calculate P(a,b|X,Y) properly using a marginalisation over
> U (I hadn't done that before, I expect that was part of my problem).
> Hopefully that will give reasonable uncertainty estimates, lots of
> pain for a figure nobody will look at for much time :)
>
>                  Thanks,
>                      Jarle.
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: linear functional relationships with heteroscedastic & non-Gaussian errors - any packages around?

Jarle Brinchmann-2
Yes I think so if the errors were normally distributed. Unfortunately
I'm far from that but the combination of sem & its bootstrap is a good
way to deal with it in the normal case.

I must admit as a non-statistician I'm a not 100% sure what the
difference (if there is one) between a linear functional relationship
and a linear structural equation model is as they both deal with
hidden variables as far as I can see.

            J.

On Tue, Dec 2, 2008 at 9:33 PM, Spencer Graves <[hidden email]> wrote:

>     Isn't this a special case of structural equation modeling, handled by
> the 'sem' package?
>     Spencer
>
> Jarle Brinchmann wrote:
>>
>> Thanks for the reply!
>>
>> On Tue, Dec 2, 2008 at 6:34 PM, Prof Brian Ripley <[hidden email]>
>> wrote:
>>
>>>
>>> I wonder if you are using this term in its correct technical sense.
>>> A linear functional relationship is
>>>
>>> V = a + bU
>>> X = U + e
>>> Y = V + f
>>>
>>> e and f are random errors (often but not necessarily independent) with
>>> distributions possibly depending on U and V respectively.
>>>
>>
>> This is indeed what I mean, poor phrasing of me. What I have is the
>> effectively the PDF for e & f for each instance, and I wish to get a &
>> b. For Gaussian errors there are certainly various ways to approach it
>> and the maximum-likelihood estimator is fine and is what I normally
>> use when my errors are sort-of-normal.
>>
>> However in this instance my uncertainty estimates are strongly
>> non-Gaussian and even defining the mode of the distribution becomes
>> rather iffy so  I really prefer to sample the likelihoods properly.
>> Using the maximum-likelihood estimator naively in this case is not
>> terribly useful and I have no idea what the derived confidence limits
>> "means".
>>
>> Ah well, I guess what I have to do at the moment is to use brute force
>> and try to calculate P(a,b|X,Y) properly using a marginalisation over
>> U (I hadn't done that before, I expect that was part of my problem).
>> Hopefully that will give reasonable uncertainty estimates, lots of
>> pain for a figure nobody will look at for much time :)
>>
>>                 Thanks,
>>                     Jarle.
>>
>> ______________________________________________
>> [hidden email] mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: linear functional relationships with heteroscedastic & non-Gaussian errors - any packages around?

Prof Brian Ripley
On Tue, 2 Dec 2008, Jarle Brinchmann wrote:

> Yes I think so if the errors were normally distributed. Unfortunately
> I'm far from that but the combination of sem & its bootstrap is a good
> way to deal with it in the normal case.
>
> I must admit as a non-statistician I'm a not 100% sure what the
> difference (if there is one) between a linear functional relationship
> and a linear structural equation model is as they both deal with
> hidden variables as far as I can see.

U and V are not 'variables' (not random variables) in a linear functional
relationship (they are in the closely related linear structural
relationship).

>
>            J.
>
> On Tue, Dec 2, 2008 at 9:33 PM, Spencer Graves <[hidden email]> wrote:
>>     Isn't this a special case of structural equation modeling, handled by
>> the 'sem' package?
>>     Spencer
>>
>> Jarle Brinchmann wrote:
>>>
>>> Thanks for the reply!
>>>
>>> On Tue, Dec 2, 2008 at 6:34 PM, Prof Brian Ripley <[hidden email]>
>>> wrote:
>>>
>>>>
>>>> I wonder if you are using this term in its correct technical sense.
>>>> A linear functional relationship is
>>>>
>>>> V = a + bU
>>>> X = U + e
>>>> Y = V + f
>>>>
>>>> e and f are random errors (often but not necessarily independent) with
>>>> distributions possibly depending on U and V respectively.
>>>>
>>>
>>> This is indeed what I mean, poor phrasing of me. What I have is the
>>> effectively the PDF for e & f for each instance, and I wish to get a &
>>> b. For Gaussian errors there are certainly various ways to approach it
>>> and the maximum-likelihood estimator is fine and is what I normally
>>> use when my errors are sort-of-normal.
>>>
>>> However in this instance my uncertainty estimates are strongly
>>> non-Gaussian and even defining the mode of the distribution becomes
>>> rather iffy so  I really prefer to sample the likelihoods properly.
>>> Using the maximum-likelihood estimator naively in this case is not
>>> terribly useful and I have no idea what the derived confidence limits
>>> "means".
>>>
>>> Ah well, I guess what I have to do at the moment is to use brute force
>>> and try to calculate P(a,b|X,Y) properly using a marginalisation over
>>> U (I hadn't done that before, I expect that was part of my problem).
>>> Hopefully that will give reasonable uncertainty estimates, lots of
>>> pain for a figure nobody will look at for much time :)
>>>
>>>                 Thanks,
>>>                     Jarle.
>>>
>>> ______________________________________________
>>> [hidden email] mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

--
Brian D. Ripley,                  [hidden email]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.