Help with nonlinear least squares regression curve fitting

Help with nonlinear least squares regression curve fitting

Corey Callaghan
Hi everyone. This is my first post to this forum and I'm hoping someone can help.

I'm trying to finish up some analysis for my thesis and this is the last problem I have. I have calculated data for 15 different species of birds; below is an example of one species and what the data might look like.

I have three a priori nonlinear curves that I want to test each data set against in order to see which of the three curves has the best fit. (I suspect the fit won't be that great for any of them in some instances.)

The curves' functions that I want to test are in the code here (hopefully correctly):

Inverse Quadratic Curve:
fitmodel <- nls(Area ~ (-a*Year)*(Year + b), data = df, start=list(a=??, b=??, c=??))

Sigmoidal Curve:
fitmodel <- nls(Area~a/(1+exp(-(b+c*Year))), data=df, start=list(a=???, b=???, c=??))

Double sigmoidal Curve:
fitmodel <- nls(Area~a+2b(1/(1+exp(-abs(-c*Year+d)))-1/2)*sign(-c*Year+d), data=df, start=list(a=???, b=???, c=???)

My problem is that I can't figure out how to choose starting values that avoid the singular matrix error. Any help on how to go about this would be appreciated! Does everything look right? Is my method okay?

If I can get the fits to run I plan on using AIC to select the best curve for each of the 15 species.

I thank you in advance for your consideration and help on this!
Cheers,
Corey Callaghan


df:
Area                 Year
104.7181283 1984
32.88026974 1985
56.07395863 1986
191.3422143 1987
233.4661392 1988
57.28317116 1989
201.1273404 1990
34.42570796 1991
165.8962342 1992
58.21905274 1993
114.6643724 1994
342.3461986 1995
184.8877994 1996
94.90509356 1997
45.2026941 1998
68.6196393 1999
575.2440229 2000
519.7557581 2001
904.157509 2002
1107.357517 2003
1682.876061 2004
40.55667824 2005
740.5032604 2006
885.7243469 2007
395.4190968 2008
1031.314519 2009
2597.544987 2010
1316.968695 2011
848.7093901 2012
5076.675075 2013
6132.975491 2014

Re: Help with nonlinear least squares regression curve fitting

Andrew Robinson-6
Finding starting values is a bit of a dark art.  That said, there are steps
you can take, but it may take time.

First, I would scale Year so that it's not in the thousands! Experiment
with subtracting 1980 or so.  For specific advice, see inline.
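
To see why the shift helps, here is a minimal sketch (an illustration, not part of the original reply). In the sigmoid a/(1+exp(-(b+c*Year))), the derivatives with respect to b and c differ only by a factor of Year, so the conditioning of the two-column matrix cbind(1, Year) is a rough proxy for how well b and c can be told apart. The names kappa_raw and kappa_shifted are made up for this example.

```r
Year <- 1984:2014

# Condition number of the (1, Year) design, before and after the shift.
# A large value means the two columns are nearly collinear.
kappa_raw     <- kappa(cbind(1, Year), exact = TRUE)
kappa_shifted <- kappa(cbind(1, Year - 1980), exact = TRUE)

c(raw = kappa_raw, shifted = kappa_shifted)
```

The shifted version should come out far better conditioned, which is the practical reason for not leaving Year in the thousands.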

On Thu, Feb 26, 2015 at 3:03 AM, Corey Callaghan <[hidden email]>
wrote:

> The curves' functions that I want to test are in the code here (hopefully
> correctly):
>
> Inverse Quadratic Curve:
> fitmodel <- nls(Area ~ (-a*Year)*(Year + b), data = df, start=list(a=??,
> b=??, c=??))
>

I would plot the data and a smoothing spline, differentiate the curve
function, identify some parameter values somewhere stable, and estimate
some values by eye, or even predict them from the first derivative of the
spline - smooth.spline() with predict(..., deriv = 1) will do this.
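
For this particular curve there is also a shortcut that avoids the spline, sketched here under the assumption that Year has been shifted by 1980 as suggested above. Expanding (-a*t)*(t + b) gives -a*t^2 - a*b*t, a quadratic with no intercept, so lm() can supply starting values for a and b directly. Note that the formula has no c, so the start list holds only a and b. The column t and the names lin, start_iq and fit_iq are hypothetical; the data are those from the original post.

```r
# Data from the original post (Area by Year, 1984-2014).
df <- data.frame(
  Year = 1984:2014,
  Area = c(104.7181283, 32.88026974, 56.07395863, 191.3422143, 233.4661392,
           57.28317116, 201.1273404, 34.42570796, 165.8962342, 58.21905274,
           114.6643724, 342.3461986, 184.8877994, 94.90509356, 45.2026941,
           68.6196393, 575.2440229, 519.7557581, 904.157509, 1107.357517,
           1682.876061, 40.55667824, 740.5032604, 885.7243469, 395.4190968,
           1031.314519, 2597.544987, 1316.968695, 848.7093901, 5076.675075,
           6132.975491)
)
df$t <- df$Year - 1980   # shifted time (assumption: origin at 1980)

# Linear fit of the expanded form: Area = p1 * t^2 + p2 * t,
# where p1 = -a and p2 = -a * b.
lin <- lm(Area ~ 0 + I(t^2) + t, data = df)
p <- unname(coef(lin))
start_iq <- list(a = -p[1], b = p[2] / p[1])

# Nonlinear fit in the original parameterisation, started from those values.
fit_iq <- nls(Area ~ (-a * t) * (t + b), data = df, start = start_iq)
coef(fit_iq)
AIC(fit_iq)   # can be compared with AIC() of the other curves once they fit
```

Because the curve is only a reparameterisation of a linear model, nls() starts at the least-squares solution and has essentially nothing left to do.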

> Sigmoidal Curve:
> fitmodel <- nls(Area~a/(1+exp(-(b+c*Year))), data=df, start=list(a=???,
> b=???, c=??))
>

I'd use the highest value as a, fit a spline as above, then invert Area at
two times to get b and c.
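
One way to carry out the inversion, as a sketch rather than the exact recipe above: fix a slightly above the largest observation (the factor 1.1 is an arbitrary assumption, needed so the logit stays finite), then use the fact that qlogis(Area/a) = b + c*t is linear in t. This regresses on all years instead of picking two times off a spline. The column t and the names a0, lin, start_sig and fit_sig are hypothetical.

```r
# Data from the original post (Area by Year, 1984-2014).
df <- data.frame(
  Year = 1984:2014,
  Area = c(104.7181283, 32.88026974, 56.07395863, 191.3422143, 233.4661392,
           57.28317116, 201.1273404, 34.42570796, 165.8962342, 58.21905274,
           114.6643724, 342.3461986, 184.8877994, 94.90509356, 45.2026941,
           68.6196393, 575.2440229, 519.7557581, 904.157509, 1107.357517,
           1682.876061, 40.55667824, 740.5032604, 885.7243469, 395.4190968,
           1031.314519, 2597.544987, 1316.968695, 848.7093901, 5076.675075,
           6132.975491)
)
df$t <- df$Year - 1980   # shifted time (assumption: origin at 1980)

# Asymptote guess: a little above the maximum so Area / a0 lies in (0, 1).
a0 <- 1.1 * max(df$Area)

# Area = a / (1 + exp(-(b + c*t)))  <=>  qlogis(Area / a) = b + c*t
lin <- lm(qlogis(Area / a0) ~ t, data = df)
start_sig <- list(a = a0,
                  b = unname(coef(lin)[1]),
                  c = unname(coef(lin)[2]))
str(start_sig)

# The fit can still fail when the data show no plateau, so trap the error.
fit_sig <- try(
  nls(Area ~ a / (1 + exp(-(b + c * t))), data = df, start = start_sig),
  silent = TRUE
)
if (inherits(fit_sig, "try-error")) {
  message("nls() did not converge from these starting values")
} else {
  print(coef(fit_sig))
}
```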

> Double sigmoidal Curve:
> fitmodel <- nls(Area~a+2b(1/(1+exp(-abs(-c*Year+d)))-1/2)*sign(-c*Year+d),
> data=df, start=list(a=???, b=???, c=???)
>

 I'd use min(Area) as a, figure out b from the maximum (I guess 2b+a is the
asymptote), and experiment with two values for year to retrieve c and d
.... uniroot might help?
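
It may help to note that, once 2b is read as 2*b, the abs()/sign() construction collapses to a hyperbolic tangent: 2*(1/(1+exp(-|u|)) - 1/2)*sign(u) = tanh(u/2). The sketch below checks this numerically. Under that reading the curve runs between a - b and a + b, with a as the midpoint, and the tanh form has no abs() or sign() to trip up derivative-based fitters. The function names and parameter values are invented for illustration only.

```r
# Double sigmoid as posted, with 2b read as 2*b.
double_sig <- function(Year, a, b, c, d) {
  u <- -c * Year + d
  a + 2 * b * (1 / (1 + exp(-abs(u))) - 1 / 2) * sign(u)
}

# Smooth equivalent without abs() or sign().
double_sig_tanh <- function(Year, a, b, c, d) {
  a + b * tanh((d - c * Year) / 2)
}

# Compare the two forms on a grid of shifted years (illustrative values).
t  <- seq(0, 40, by = 0.25)
y1 <- double_sig(t, a = 3000, b = 3000, c = -0.3, d = -8)
y2 <- double_sig_tanh(t, a = 3000, b = 3000, c = -0.3, d = -8)

all.equal(y1, y2)
range(y2)   # bounded by a - b and a + b
```

With this form, a rough start is a = midpoint of the range of Area and b = half that range; a negative c gives a curve that rises with Year.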

Cheers

Andrew

--
Andrew Robinson
Deputy Director, CEBRA, School of Biosciences
Reader & Associate Professor in Applied Statistics  Tel: (+61) 0403 138 955
School of Mathematics and Statistics                Fax: +61-3-8344 4599
University of Melbourne, VIC 3010 Australia
Email: [hidden email]
Website: http://www.ms.unimelb.edu.au/~andrewpr

MSME: http://www.crcpress.com/product/isbn/9781439858028
FAwR: http://www.ms.unimelb.edu.au/~andrewpr/FAwR/
SPuR: http://www.ms.unimelb.edu.au/spuRs/


______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: Help with nonlinear least squares regression curve fitting

Prof J C Nash (U30A)
In reply to this post by Corey Callaghan
Andrew's suggestion for Year is a help, but package nlmrt shows the
problem you are trying to solve is truly one where there is a Jacobian
singularity. (nlmrt produces the Jacobian singular values -- but read
the output carefully because these are placed for compact output as if
they correspond to parameters, which they do not).
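
One visible source of a singular Jacobian can be sketched in base R, without nlmrt: the inverse quadratic call as posted lists a starting value for c, but c never appears in the formula, so its Jacobian column is identically zero. Whether this is the same singularity that nlmrt reported is an assumption. The starting values a = 1 and b = 1 are illustrative.

```r
Year <- 1984:2014
a <- 1
b <- 1

# Analytic Jacobian of (-a*Year)*(Year + b) with respect to a, b and c.
J <- cbind(
  a = -Year * (Year + b),  # derivative with respect to a
  b = -a * Year,           # derivative with respect to b
  c = 0                    # c is absent from the model: zero column
)

# Singular values of the Jacobian; a zero means the matrix is rank deficient.
sv <- svd(J)$d
sv
```

Dropping c from the start list removes that zero column; the remaining two columns are not collinear, although they are badly scaled while Year is left unshifted.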

Unfortunately, nlmrt tries to use analytic derivatives, and the sign()
used in the double sigmoid is not in its derivatives table. BTW, your
double sigmoid has a typo (2b should be 2*b). Do provide reproducible
results. Here is what I did using

callaghan.csv:
Area,Year
104.7181283,1984
32.88026974,1985
56.07395863,1986
191.3422143,1987
233.4661392,1988
57.28317116,1989
201.1273404,1990
34.42570796,1991
165.8962342,1992
58.21905274,1993
114.6643724,1994
342.3461986,1995
184.8877994,1996
94.90509356,1997
45.2026941,1998
68.6196393,1999
575.2440229,2000
519.7557581,2001
904.157509,2002
1107.357517,2003
1682.876061,2004
40.55667824,2005
740.5032604,2006
885.7243469,2007
395.4190968,2008
1031.314519,2009
2597.544987,2010
1316.968695,2011
848.7093901,2012
5076.675075,2013
6132.975491,2014

code:
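library(nlmrt)
df <- read.csv("callaghan.csv")

# Inverse quadratic; start list as posted (c does not appear in the formula)
fitmodeliq <- nlxb(Area ~ (-a*Year)*(Year + b), data = df,
                   start = list(a = 1, b = 1, c = 1))

# Sigmoid
fitmodelsig <- nlxb(Area ~ a/(1 + exp(-(b + c*Year))), data = df,
                    start = list(a = 1, b = 1, c = 1))

# Double sigmoid with 2b written as 2*b (d has no starting value here)
fitmodelds <- nlxb(Area ~ a + 2*b*(1/(1 + exp(-abs(-c*Year + d))) - 1/2)*sign(-c*Year + d),
                   data = df, start = list(a = 1, b = 1, c = 1))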

For information of readers, Duncan Murdoch and I have been working on
nls14 to replace/augment nls(), but we've a way to go yet before this is
ready for CRAN. Collaborators welcome.

John Nash

