variance explained by each term in a GAM

Julian Burgos
Hello fellow R's,

I apologize if this is a basic question. I'm fitting some GAMs using the mgcv package, and I am wondering what the most appropriate way is to determine how much of the variability in the dependent variable is explained by each term in the model. The information provided by summary.gam() relates to the significance of each term (F, p-value) and to the "wiggliness" of the fitted smooth (edf), but (as far as I understand) there is no information on the proportion of variance explained.

One option may be to fit reduced models without each term, and calculate the change in deviance. For example:

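library(mgcv)
m1 <- gam(y ~ s(x1) + s(x2))  # full model
m2 <- gam(y ~ s(x2))          # s(x1) dropped
m3 <- gam(y ~ s(x1))          # s(x2) dropped

ddev1 <- deviance(m2) - deviance(m1)  # increase in deviance when s(x1) is dropped
ddev2 <- deviance(m3) - deviance(m1)  # increase in deviance when s(x2) is dropped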

Here, ddev1 would measure the part of the variability in y explained by x1, and ddev2 would do the same for x2. Does this sound like an appropriate approach?

Julian

Julian Burgos
FAR lab
University of Washington

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: variance explained by each term in a GAM

Simon Wood-4
I think that your approach is reasonable, except that you should use the same
smoothing parameters throughout, i.e. the reduced models should use the same
smoothing parameters as the full model. Otherwise you get into trouble if x1
and x2 are correlated, since the smoothing parameters will then tend to
change a lot when terms are dropped, as one smooth tries to 'do the work' of
the other. Here's an example (which can be modified to illustrate the problem
with not fixing the sp's):

library(mgcv)
## simulate some data
set.seed(0)
n <- 400
x1 <- runif(n, 0, 1)
## to see the problem with not fixing smoothing parameters,
## remove the `##' from the next line, and the `sp'
## arguments from the `gam' calls generating b1 and b2.
x2 <- runif(n, 0, 1) ## *.1 + x1
f1 <- function(x) exp(2 * x)
f2 <- function(x) 0.2*x^11*(10*(1-x))^6+10*(10*x)^3*(1-x)^10
f <- f1(x1) + f2(x2)
e <- rnorm(n, 0, 2)
y <- f + e
## fit full and reduced models...
b <- gam(y~s(x1)+s(x2))
b1 <- gam(y~s(x1),sp=b$sp[1]) ## s(x2) dropped, sp fixed at full-model value
b2 <- gam(y~s(x2),sp=b$sp[2]) ## s(x1) dropped, sp fixed at full-model value
b0 <- gam(y~1) ## null model
## calculate proportions deviance explained...
(deviance(b1)-deviance(b))/deviance(b0) ## prop explained by s(x2)
(deviance(b2)-deviance(b))/deviance(b0) ## prop explained by s(x1)
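
The modification described in the comments above (correlated covariates, with and without the `sp' arguments) could be sketched side by side roughly as follows. This is only an illustrative sketch: the object names b1_fixed, b1_free, b2_fixed, b2_free and the helper prop() are made up here and are not part of mgcv or of the original example.

```r
library(mgcv)

## simulate data as above, but with x2 strongly correlated with x1
set.seed(0)
n  <- 400
x1 <- runif(n, 0, 1)
x2 <- runif(n, 0, 1) * 0.1 + x1
f1 <- function(x) exp(2 * x)
f2 <- function(x) 0.2 * x^11 * (10 * (1 - x))^6 + 10 * (10 * x)^3 * (1 - x)^10
y  <- f1(x1) + f2(x2) + rnorm(n, 0, 2)

b  <- gam(y ~ s(x1) + s(x2))   # full model
b0 <- gam(y ~ 1)               # null model

## reduced models with smoothing parameters fixed at the full-model values
b1_fixed <- gam(y ~ s(x1), sp = b$sp[1])
b2_fixed <- gam(y ~ s(x2), sp = b$sp[2])

## reduced models with smoothing parameters re-estimated
b1_free <- gam(y ~ s(x1))
b2_free <- gam(y ~ s(x2))

## hypothetical helper: proportion of null deviance lost when a term is dropped
prop <- function(reduced) (deviance(reduced) - deviance(b)) / deviance(b0)

res <- rbind(fixed = c(s_x2 = prop(b1_fixed), s_x1 = prop(b2_fixed)),
             free  = c(s_x2 = prop(b1_free),  s_x1 = prop(b2_free)))
print(res)
```

Comparing the `fixed' and `free' rows shows how much the per-term proportions move when the remaining smooth is allowed to re-tune its smoothing parameter.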






--
Simon Wood, Mathematical Sciences, University of Bath, Bath, BA2 7AY UK
+44 1225 386603  www.maths.bath.ac.uk/~sw283


Re: variance explained by each term in a GAM

Julian Burgos
Thanks again for your answer, Prof. Wood.

And my apologies to the list for my repeated message from yesterday.
I am still trying to figure out what happened with my email software.

Julian


Re: variance explained by each term in a GAM

Julian Burgos
In reply to this post by Simon Wood-4
Dear Prof. Wood,

Just another quick question.  I am doing model selection following Wood
and Augustin (2002).  One of the criteria for retaining a term is whether
removing it causes an increase in the GCV score.  When doing this, do
I also need to fix the smoothing parameters?
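
For concreteness, the comparison in question could be sketched as below. This only shows where the GCV score of each fit can be read from the fitted object (the gcv.ubre component); it does not settle whether the smoothing parameters should be fixed. The simulated data and the object names full, drop_x1 and drop_x2 are illustrative assumptions.

```r
library(mgcv)

## illustrative data only
set.seed(0)
n  <- 400
x1 <- runif(n, 0, 1)
x2 <- runif(n, 0, 1)
y  <- exp(2 * x1) + sin(2 * pi * x2) + rnorm(n, 0, 2)

full    <- gam(y ~ s(x1) + s(x2))  # smoothing parameters chosen by GCV
drop_x1 <- gam(y ~ s(x2))          # s(x1) removed, smoothing parameter re-estimated
drop_x2 <- gam(y ~ s(x1))          # s(x2) removed, smoothing parameter re-estimated

## gcv.ubre holds the minimised GCV (or UBRE) score of each fit
gcv <- c(full    = as.numeric(full$gcv.ubre),
         drop_x1 = as.numeric(drop_x1$gcv.ubre),
         drop_x2 = as.numeric(drop_x2$gcv.ubre))
print(gcv)
```

A term whose removal raises the score relative to the full model would be a candidate for retention under that criterion.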

Thanks,

Julian Burgos

Fisheries Acoustics Research Lab
School of Aquatic and Fishery Science
University of Washington

1122 NE Boat Street
Seattle, WA  98105

