Thanks a lot, Simon. That was a silly mistake on my part. After correcting,

small that I guess it is just numerics.

> Dear Øystein,

>

> In your code 'id' is set up to be numeric. In your first gamm call it

> gets coerced to a factor (since nothing else would make sense). In your

> second gamm call it is simply treated as numeric, since that could makes

> sense. To make the two models equivalent you just need to substitute:

>

> id <- as.factor(rep(1:ng, n/ng))

>

> into your loop.

>

> best,

> Simon

>

> On 01/10/18 08:15, Øystein Sørensen wrote:

> > I have longitudinal data where a response y_{ij} has been measured

> > repeatedly for i=1,...,n individuals, j=1,...,m times for each

> individual.

> > Let x_{ij} be the age of individual i at her/his jth measurement. I wish

> to

> > see how the response varies with age, and have reason to believe that a

> > nonlinear fit is needed. I hence wish to model the relationship using an

> > additive mixed model of the form y_{ij} = f(x_{ij}) + b_{i} +

> > \epsilon_{ij}, where f() is some smooth function to be estimated, b_{i}

> > \sim N(0, \sigma_{b}^{2}) is the random effect for individual i,

> > \epsilon_{ij} \sim N(0, \sigma^{2}) is the residual.

> >

> > Reading the documentation to the mgcv package, I see that such models can

> > be set up by calling the mgcv::gamm two different ways. One way shown to

> > set up such a model is with the call

> > b1 <- gamm(y ~ s(x), random = list(id =~ 1)),

> > where id is an indicator of the specific individual. In this way, the

> > random effect is specified in a list. The other way is to set up the

> random

> > effect with a smooth of the re:

> > b2 <- gamm(y ~ s(x) + s(id, bs = "re")).

> >

> > As far as I can understand, these two setups should create the same

> > underlying model matrices. However, when running them on my data, the

> > p-values for the smooth terms, as well as their estimated degrees of

> > freedom, are different.

> >

> > Below is a reproducible example on simulated data, which show that these

> > two types of specifying the models give different p-values and estimated

> > degrees of freedom. On my real data, these differences are sometimes even

> > more exaggerated.

> >

> > My question is: Are not these two calls to gamm suppose to estimate the

> > same model, or have I misunderstood?

> >

> > Here is a reproducible example:

> >

> > library(mgcv)

> > set.seed(100)

> > n_sim <- 100

> > df <- data.frame(

> > model1_pval = numeric(n_sim),

> > model1_edf = numeric(n_sim),

> > model2_pval = numeric(n_sim),

> > model2_edf = numeric(n_sim)

> > )

> >

> > # Number of observations

> > n <- 500

> > # Number of groups

> > ng <- 100

> >

> > for(i in 1:n_sim){

> > # Draw the fixed effect covariate

> > x <- rnorm(n)

> > # Set up the group

> > id <- rep(1:ng, n/ng)

> > # Draw the random effect

> > z <- rnorm(ng)

> > # Draw the response

> > y <- 0.1 * x + 0.2 * x^3 + z[id] + rnorm(n)

> >

> > # Fit the two different models

> > b1 <- gamm(y ~ s(x, k = 5), random = list(id =~ 1))

> > b2 <- gamm(y ~ s(x, k = 5) + s(id, bs = "re"))

> >

> > df[i, "model1_pval"] <- anova(b1$gam)$s.pv

> > df[i, "model1_edf"] <- anova(b1$gam)$edf

> > df[i, "model2_pval"] <- anova(b2$gam)$s.pv[[1]]

> > df[i, "model2_edf"] <- anova(b2$gam)$edf[[1]]

> > }

> >

> > # Differences between model 1 and model 2 p values

> > mean(df$model1_pval)

> > #> [1] 6.790546e-21

> > mean(df$model2_pval)

> > #> [1] 3.090694e-14

> > max(abs(df$model1_pval - df$model2_pval))

> > #> [1] 2.770545e-12

> >

> > # Differences between model1 and model 2 estimated degrees of freedom

> > mean(df$model1_edf)

> > #> [1] 3.829992

> > mean(df$model2_edf)

> > #> [1] 3.731438

> > max(abs(df$model1_edf - df$model2_edf))

> > #> [1] 0.6320281

> >

> >

> > Thanks in advance,

> > Øystein Sørensen

> >

> > [[alternative HTML version deleted]]

> >

> > ______________________________________________

> >

[hidden email] mailing list -- To UNSUBSCRIBE and more, see

> >

https://stat.ethz.ch/mailman/listinfo/r-help> > PLEASE do read the posting guide

>

http://www.R-project.org/posting-guide.html> > and provide commented, minimal, self-contained, reproducible code.

>

>