Hi everyone,

I am studying generalized additive models using the 'mgcv' package developed by Professor Wood. However, I cannot understand the example given on the choose.k help page:

    library(mgcv)
    set.seed(1)
    dat <- gamSim(1,n=400,scale=2)

    ## fit a GAM with quite low `k'
    b <- gam(y~s(x0,k=6)+s(x1,k=6)+s(x2,k=6)+s(x3,k=6),data=dat)
    plot(b,pages=1,residuals=TRUE) ## hint of a problem in s(x2)

    ## the following suggests a problem with s(x2)
    gam.check(b)

    ## Another approach (see below for more obvious method)....
    ## check for residual pattern, removeable by increasing `k'
    ## typically `k', below, should be substantially larger than
    ## the original, `k' but certainly less than n/2.
    ## Note use of cheap "cs" shrinkage smoothers, and gamma=1.4
    ## to reduce chance of overfitting...
    rsd <- residuals(b)
    gam(rsd~s(x0,k=40,bs="cs"),gamma=1.4,data=dat) ## fine
    gam(rsd~s(x1,k=40,bs="cs"),gamma=1.4,data=dat) ## fine
    gam(rsd~s(x2,k=40,bs="cs"),gamma=1.4,data=dat) ## `k' too low
    gam(rsd~s(x3,k=40,bs="cs"),gamma=1.4,data=dat) ## fine

Why is `k' considered too low for x2? The output for that fit is:

    > gam(rsd~s(x2,k=40,bs="cs"),gamma=1.4,data=dat) ## `k' too low

    Family: gaussian
    Link function: identity

    Formula:
    rsd ~ s(x2, k = 40, bs = "cs")

    Estimated degrees of freedom:
    9.0093  total = 10.00926

    GCV score: 4.494652

From these results, the EDF is much less than k-1, so according to the statement "If the effective degrees of freedom for a model term are estimated to be much less than k-1 then this is unlikely to be very worthwhile", I would have thought the fit is reasonable. Why is it flagged as a problem?

Thanks in advance,
wanhai
|
The point is that you are checking the basis dimension used in the first model, b, where the basis dimension for s(x2) was set to 6. All the other model fits are about checking that first one. On checking the residuals from model b you detect a pattern with respect to x2, with an estimated degrees of freedom of 9, which is bigger than the maximum possible in model b (k - 1 = 5 for k = 6). So model b is probably using too small a basis dimension for s(x2).

best,
Simon

On 06/21/2012 02:07 AM, ywh123 wrote:
> why the model is not good for x2?
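The comparison described in the reply can be made explicit in code. What follows is only a minimal sketch built on the example from the original post: the variable names `k_orig`, `max_edf_b`, `chk`, `edf_rsd`, and `b2` are introduced here for illustration, and the choice of k = 20 for the refit is an assumption, not a recommendation from the thread. The exact EDF values may differ from the 9.0093 reported above depending on the R and mgcv versions.

```r
library(mgcv)

set.seed(1)
dat <- gamSim(1, n = 400, scale = 2)

## Original model: every smooth uses k = 6, so each term can have
## at most k - 1 = 5 effective degrees of freedom.
k_orig <- 6
b <- gam(y ~ s(x0, k = k_orig) + s(x1, k = k_orig) +
             s(x2, k = k_orig) + s(x3, k = k_orig), data = dat)
max_edf_b <- k_orig - 1

## Residual check for x2 with a much larger basis (k = 40).
rsd <- residuals(b)
chk <- gam(rsd ~ s(x2, k = 40, bs = "cs"), gamma = 1.4, data = dat)
edf_rsd <- summary(chk)$s.table["s(x2)", "edf"]

## The EDF found in the residuals is compared with the ceiling in b,
## not with k - 1 = 39 of the checking model.
cat("EDF of residual smooth in x2:", round(edf_rsd, 2), "\n")
cat("Maximum EDF available to s(x2) in b:", max_edf_b, "\n")

## Hypothetical follow-up: refit with a larger basis for s(x2) only.
b2 <- gam(y ~ s(x0, k = k_orig) + s(x1, k = k_orig) +
              s(x2, k = 20) + s(x3, k = k_orig), data = dat)
print(summary(b2)$s.table[, "edf"])
```

If the residual smooth needs clearly more than `max_edf_b` degrees of freedom, that points to leftover structure that s(x2) in b could not capture with k = 6.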
|
Hi, thanks very much.

I have some further questions about GAMs.

First, are there any restrictions on the sample size? For example, I am studying GDP and foreign direct investment in 29 provinces in China (N=29). Is N too small? If so, could I use pooled data (N=29, T=5)?

Second, could I use the 'mgcv' package to fit a panel data model by adding individual-specific fixed effects and time fixed effects, as in a generalized mixed model (e.g. by adding factor terms)? For example, in the paper by Roberto Basile, "Regional economic growth in Europe: A semiparametric spatial dependence approach", published in Papers in Regional Science, the author employs a semiparametric spatial Durbin model to analyse the growth behaviour of 155 European regions over the period 1988-2000. I am not sure how to arrange the data for such a model.

Thanks very much in advance.

Best regards,
Wanhai
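On the data-arrangement question, one common layout is a long-format data frame with one row per province-year, where the unit and period identifiers are factors. The sketch below uses simulated data only; the names `panel`, `gdp`, `fdi`, `region`, and `year` are hypothetical, the basis dimension k = 10 is an arbitrary choice, and this is not the specification used in Basile's paper (it includes no spatial terms).

```r
library(mgcv)

set.seed(42)
n_region <- 29  # provinces (N)
n_year   <- 5   # periods (T)

## Long format: one row per region-year combination (29 * 5 = 145 rows).
panel <- expand.grid(region = factor(seq_len(n_region)),
                     year   = factor(seq_len(n_year)))

## Simulated covariate and response, for illustration only.
panel$fdi  <- runif(nrow(panel))
region_eff <- rnorm(n_region)
panel$gdp  <- sin(2 * pi * panel$fdi) +
              region_eff[as.integer(panel$region)] +
              rnorm(nrow(panel), sd = 0.3)

## Smooth effect of fdi plus region and year fixed effects,
## entered as ordinary parametric factor terms.
fit <- gam(gdp ~ s(fdi, k = 10) + region + year, data = panel)
print(summary(fit)$s.table)
```

With only 145 observations, the 28 region dummies and 4 year dummies already use a sizeable share of the available degrees of freedom, so the basis dimension of the smooth would probably have to stay small.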
