lmer model building--include random effects?

classic Classic list List threaded Threaded
3 messages Options
Reply | Threaded
Open this post in threaded view
|

lmer model building--include random effects?

Ista Zahn
Hello,
This is a follow up question to my previous one http://tolstoy.newcastle.edu.au/R/e4/help/08/02/3600.html

I am attempting to model relationship satisfaction (MAT) scores  
(measurements at 5 time points), using participant (spouseID) and  
couple id (ID) as grouping variables, and time (years) and conflict  
(MCI.c) as predictors. I have been instructed to include random  
effects for the slopes of both predictors as well as the intercepts,  
and then to drop non-significant random effects from the model. The  
instructor and the rest of the class is using HLM 6.0, which gives p-
values for random effects, and the procedure is simply to run a model,  
note which random effects are not significant, and drop them from the  
model. I was hoping I could to something analogous by using the anova  
function to compare models with and without a particular random  
effect, but I get dramatically different results than those obtained  
with HLM 6.0.

For example, I wanted to determine if I should include a random effect  
for the variable "MCI.c" (at the couple level), so I created two  
models, one with and one without, and compared them:

 > m.3 <- lmer(MAT ~ 1 + years + MCI.c + (1 + years | spouseID) + (1 +  
years + MCI.c | ID), data=Data, method = "ML")
 > m.1 <- lmer(MAT ~ 1 + years + MCI.c  + (1 + years + MCI.c |  
spouseID) + (1 + years + MCI.c | ID), data=Data, method = "ML")
 > anova(m.1, m.3)
Data: Data
Models:
m.3: MAT ~ 1 + years + MCI.c + (1 + years | spouseID) + (1 + years +
m.1:     MCI.c | ID)
m.3: MAT ~ 1 + years + MCI.c + (1 + years + MCI.c | spouseID) + (1 +
m.1:     years + MCI.c | ID)
     Df     AIC     BIC  logLik  Chisq Chi Df Pr(>Chisq)
m.3 12  5777.8  5832.7 -2876.9
m.1 15  5780.9  5849.5 -2875.4 2.9428      3     0.4005

The corresponding output from HLM 6.0 reads

  Random Effect           Standard      Variance     df    Chi-
square   P-value
                          Deviation     Component
   
------------------------------------------------------------------------------
  INTRCPT1,       R0      6.80961      46.37075      60      
112.80914    0.000
     YEARS slope, R1      1.49329       2.22991      60      
59.38729    >.500
       MCI slope, R2      5.45608      29.76881      60      
90.57615    0.007
   
------------------------------------------------------------------------------

To me, this seems to indicate that HLM 6.0 is suggesting that the  
random effect should be included in the model, while R is suggesting  
that it need not be. This is not (quite) a "why do I get different  
results with X" post, but rather an "I'm worried that I might be doing  
something wrong" post. Does what I've done look reasonable? Is there a  
better way to go about it?

Thank you very much for reading this, and for any advice.
-Ista

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: lmer model building--include random effects?

Douglas Bates-2
On 4/22/08, Ista Zahn <[hidden email]> wrote:
> Hello,
>  This is a follow up question to my previous one http://tolstoy.newcastle.edu.au/R/e4/help/08/02/3600.html

>  I am attempting to model relationship satisfaction (MAT) scores
>  (measurements at 5 time points), using participant (spouseID) and
>  couple id (ID) as grouping variables, and time (years) and conflict
>  (MCI.c) as predictors. I have been instructed to include random
>  effects for the slopes of both predictors as well as the intercepts,
>  and then to drop non-significant random effects from the model. The
>  instructor and the rest of the class is using HLM 6.0, which gives p-
>  values for random effects, and the procedure is simply to run a model,
>  note which random effects are not significant, and drop them from the
>  model. I was hoping I could to something analogous by using the anova
>  function to compare models with and without a particular random
>  effect, but I get dramatically different results than those obtained
>  with HLM 6.0.

>  For example, I wanted to determine if I should include a random effect
>  for the variable "MCI.c" (at the couple level), so I created two
>  models, one with and one without, and compared them:

>   > m.3 <- lmer(MAT ~ 1 + years + MCI.c + (1 + years | spouseID) + (1 +
>  years + MCI.c | ID), data=Data, method = "ML")
>   > m.1 <- lmer(MAT ~ 1 + years + MCI.c  + (1 + years + MCI.c |
>  spouseID) + (1 + years + MCI.c | ID), data=Data, method = "ML")
>   > anova(m.1, m.3)
>  Data: Data
>  Models:
>  m.3: MAT ~ 1 + years + MCI.c + (1 + years | spouseID) + (1 + years +
>  m.1:     MCI.c | ID)
>  m.3: MAT ~ 1 + years + MCI.c + (1 + years + MCI.c | spouseID) + (1 +
>  m.1:     years + MCI.c | ID)
>      Df     AIC     BIC  logLik  Chisq Chi Df Pr(>Chisq)
>  m.3 12  5777.8  5832.7 -2876.9
>  m.1 15  5780.9  5849.5 -2875.4 2.9428      3     0.4005

>  The corresponding output from HLM 6.0 reads

>   Random Effect           Standard      Variance     df    Chi- square   P-value
>                           Deviation     Component

>  ------------------------------------------------------------------------------
>   INTRCPT1,       R0      6.80961      46.37075      60  112.80914    0.000
>      YEARS slope, R1      1.49329       2.22991      60  59.38729    >.500
>        MCI slope, R2      5.45608      29.76881      60   90.57615    0.007
>  ------------------------------------------------------------------------------

>  To me, this seems to indicate that HLM 6.0 is suggesting that the
>  random effect should be included in the model, while R is suggesting
>  that it need not be. This is not (quite) a "why do I get different
>  results with X" post, but rather an "I'm worried that I might be doing
>  something wrong" post. Does what I've done look reasonable? Is there a
>  better way to go about it

The first thing I would try to determine is whether the model fit by
HLM is equivalent to the model you have fit with lmer.  The part of
the HLM model output you have shown lists only variance components.
It does not provide covariances or correlations.  The lmer model is
fitting a 3 by 3 symmetric positive definite variance-covariance
matrix with a total of 6 parameters - 3 variances and 3 covariances.
It may be that HLM is fitting a simpler model in which the covariances
are all zero.

The next question would be exactly how HLM is calculating that
p-value.  I wonder where the 60 degrees of freedom comes from.  Do you
happen to have 60 couples in the study?

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: lmer model building--include random effects?

Ista Zahn

On Apr 23, 2008, at 8:56 AM, Douglas Bates wrote:

> On 4/22/08, Ista Zahn <[hidden email]> wrote:
>> Hello,
>> This is a follow up question to my previous one http://tolstoy.newcastle.edu.au/R/e4/help/08/02/3600.html
>
>> I am attempting to model relationship satisfaction (MAT) scores
>> (measurements at 5 time points), using participant (spouseID) and
>> couple id (ID) as grouping variables, and time (years) and conflict
>> (MCI.c) as predictors. I have been instructed to include random
>> effects for the slopes of both predictors as well as the intercepts,
>> and then to drop non-significant random effects from the model. The
>> instructor and the rest of the class is using HLM 6.0, which gives p-
>> values for random effects, and the procedure is simply to run a  
>> model,
>> note which random effects are not significant, and drop them from the
>> model. I was hoping I could to something analogous by using the anova
>> function to compare models with and without a particular random
>> effect, but I get dramatically different results than those obtained
>> with HLM 6.0.
>
>> For example, I wanted to determine if I should include a random  
>> effect
>> for the variable "MCI.c" (at the couple level), so I created two
>> models, one with and one without, and compared them:
>
>>> m.3 <- lmer(MAT ~ 1 + years + MCI.c + (1 + years | spouseID) + (1 +
>> years + MCI.c | ID), data=Data, method = "ML")
>>> m.1 <- lmer(MAT ~ 1 + years + MCI.c  + (1 + years + MCI.c |
>> spouseID) + (1 + years + MCI.c | ID), data=Data, method = "ML")
>>> anova(m.1, m.3)
>> Data: Data
>> Models:
>> m.3: MAT ~ 1 + years + MCI.c + (1 + years | spouseID) + (1 + years +
>> m.1:     MCI.c | ID)
>> m.3: MAT ~ 1 + years + MCI.c + (1 + years + MCI.c | spouseID) + (1 +
>> m.1:     years + MCI.c | ID)
>>     Df     AIC     BIC  logLik  Chisq Chi Df Pr(>Chisq)
>> m.3 12  5777.8  5832.7 -2876.9
>> m.1 15  5780.9  5849.5 -2875.4 2.9428      3     0.4005
>
>> The corresponding output from HLM 6.0 reads
>
>>  Random Effect           Standard      Variance     df    Chi-  
>> square   P-value
>>                          Deviation     Component
>
>> ------------------------------------------------------------------------------
>>  INTRCPT1,       R0      6.80961      46.37075      60  
>> 112.80914    0.000
>>     YEARS slope, R1      1.49329       2.22991      60  59.38729    
>> >.500
>>       MCI slope, R2      5.45608      29.76881      60    
>> 90.57615    0.007
>> ------------------------------------------------------------------------------
>
>> To me, this seems to indicate that HLM 6.0 is suggesting that the
>> random effect should be included in the model, while R is suggesting
>> that it need not be. This is not (quite) a "why do I get different
>> results with X" post, but rather an "I'm worried that I might be  
>> doing
>> something wrong" post. Does what I've done look reasonable? Is  
>> there a
>> better way to go about it
>
> The first thing I would try to determine is whether the model fit by
> HLM is equivalent to the model you have fit with lmer.  The part of
> the HLM model output you have shown lists only variance components.
> It does not provide covariances or correlations.  The lmer model is
> fitting a 3 by 3 symmetric positive definite variance-covariance
> matrix with a total of 6 parameters - 3 variances and 3 covariances.
> It may be that HLM is fitting a simpler model in which the covariances
> are all zero.
>
Yes, I was also concerned that the model I fit in R may not be exactly  
the model fit by HLM. The estimates are similar but not exact. The  
model summaries from HLM and R are as follows:

HLM OUTPUT:


   The outcome variable is      MAT

   The model specified for the fixed effects was:
  ----------------------------------------------------

    Level-1                Level-2           Level-3
    Coefficients           Predictors        Predictors
  ---------------------  ---------------   ----------------
         INTRCPT1, P0      INTRCPT2, B00    INTRCPT3, G000
      YEARS slope, P1      INTRCPT2, B10    INTRCPT3, G100
  *     MCI slope, P2      INTRCPT2, B20    INTRCPT3, G200

  '*' - This variable has been centered around its group mean


  Summary of the model specified (in equation format)
  ---------------------------------------------------

Level-1 Model

        Y = P0 + P1*(YEARS) + P2*(MCI) + E

Level-2 Model

        P0 = B00 + R0
        P1 = B10 + R1
        P2 = B20 + R2


Level-3 Model

        B00 = G000 + U00
        B10 = G100 + U10
        B20 = G200 + U20

Run-time deletion has reduced the number of level-1 records to 716

For starting values, data from 716 level-1 and 120 level-2 records  
were used

Iterations stopped due to small change in likelihood function
******* ITERATION 1008 *******

  Sigma_squared =    110.43050

  Standard Error of Sigma_squared =      7.77797

Tau(pi)
  INTRCPT1,P0     46.37075      5.48151     -5.91342
     YEARS,P1      5.48151      2.22991      5.80536
       MCI,P2     -5.91342      5.80536     29.76881


Tau(pi) (as correlations)
  INTRCPT1,P0  1.000  0.539 -0.159
     YEARS,P1  0.539  1.000  0.713
       MCI,P2 -0.159  0.713  1.000

Standard Errors of Tau(pi)
  INTRCPT1,P0     16.35658      6.70855     14.39441
     YEARS,P1      6.70855      4.81811      8.57942
       MCI,P2     14.39441      8.57942     22.49454


  ----------------------------------------------------
   Random level-1 coefficient   Reliability estimate
  ----------------------------------------------------
   INTRCPT1, P0                        0.482
      YEARS, P1                        0.074
        MCI, P2                        0.202
  ----------------------------------------------------

Tau(beta)
  INTRCPT1        YEARS          MCI
  INTRCPT2,B00 INTRCPT2,B10 INTRCPT2,B20
   116.72824     -0.09171     14.21000
    -0.09171      2.81646     -0.17231
    14.21000     -0.17231      1.90065

Tau(beta) (as correlations)
  INTRCPT1/INTRCPT2,B00  1.000 -0.005  0.954
     YEARS/INTRCPT2,B10 -0.005  1.000 -0.074
       MCI/INTRCPT2,B20  0.954 -0.074  1.000

Standard Errors of Tau(beta)
  INTRCPT1        YEARS          MCI
  INTRCPT2,B00 INTRCPT2,B10 INTRCPT2,B20
    30.50218      7.45597     15.16454
     7.45597      3.73945      6.40333
    15.16454      6.40333     16.72669

  ----------------------------------------------------
   Random level-2 coefficient   Reliability estimate
  ----------------------------------------------------
   INTRCPT1/INTRCPT2, B00               0.716
      YEARS/INTRCPT2, B10               0.167
        MCI/INTRCPT2, B20               0.027
  ----------------------------------------------------

The value of the likelihood function at iteration 1008 = -2.859327E+003
The outcome variable is      MAT

  Final estimation of fixed effects:
   
----------------------------------------------------------------------------
                                       Standard             Approx.
     Fixed Effect        Coefficient   Error      T-ratio   d.f.     P-
value
   
----------------------------------------------------------------------------
  For       INTRCPT1, P0
     For INTRCPT2, B00
       INTRCPT3, G000    124.486031    1.638650     75.969       59    
0.000
  For    YEARS slope, P1
     For INTRCPT2, B10
       INTRCPT3, G100     -6.257696    0.518369    -12.072       59    
0.000
  For      MCI slope, P2
     For INTRCPT2, B20
       INTRCPT3, G200     -3.698127    1.052597     -3.513       59    
0.001
   
----------------------------------------------------------------------------

  The outcome variable is      MAT

  Final estimation of fixed effects
  (with robust standard errors)
   
----------------------------------------------------------------------------
                                       Standard             Approx.
     Fixed Effect        Coefficient   Error      T-ratio   d.f.     P-
value
   
----------------------------------------------------------------------------
  For       INTRCPT1, P0
     For INTRCPT2, B00
       INTRCPT3, G000    124.486031    1.632622     76.249       59    
0.000
  For    YEARS slope, P1
     For INTRCPT2, B10
       INTRCPT3, G100     -6.257696    0.512071    -12.220       59    
0.000
  For      MCI slope, P2
     For INTRCPT2, B20
       INTRCPT3, G200     -3.698127    1.003180     -3.686       59    
0.001
   
----------------------------------------------------------------------------



  Final estimation of level-1 and level-2 variance components:
   
------------------------------------------------------------------------------
  Random Effect           Standard      Variance     df    Chi-
square   P-value
                          Deviation     Component
   
------------------------------------------------------------------------------
  INTRCPT1,       R0      6.80961      46.37075      60      
112.80914    0.000
     YEARS slope, R1      1.49329       2.22991      60      
59.38729    >.500
       MCI slope, R2      5.45608      29.76881      60      
90.57615    0.007
   level-1,       E      10.50859     110.43050
   
------------------------------------------------------------------------------


  Final estimation of level-3 variance components:
   
------------------------------------------------------------------------------
  Random Effect           Standard      Variance     df    Chi-
square   P-value
                          Deviation     Component
   
------------------------------------------------------------------------------
INTRCPT1/INTRCPT2, U00    10.80408     116.72824    59      
208.29109    0.000
    YEARS/INTRCPT2, U10     1.67823       2.81646    59      
75.45893    0.073
      MCI/INTRCPT2, U20     1.37864       1.90065    59      
64.47689    0.291
   
------------------------------------------------------------------------------


  Statistics for current covariance components model
  --------------------------------------------------
Deviance                       = 5718.653097
Number of estimated parameters = 16


R OUTPUT

 > m.1 <- lmer(MAT ~ 1 + years + MCI.c  + (1 + years + MCI.c |  
spouseID) + (1 + years + MCI.c | ID), data=Data, method = "ML")
 > summary(m.1)
Linear mixed-effects model fit by maximum likelihood
Formula: MAT ~ 1 + years + MCI.c + (1 + years + MCI.c | spouseID) + (1  
+      years + MCI.c | ID)
    Data: Data
   AIC  BIC logLik MLdeviance REMLdeviance
  5781 5850  -2875       5751         5746
Random effects:
  Groups   Name        Variance  Std.Dev. Corr
  spouseID (Intercept)  45.90014  6.77496
           years         2.52945  1.59042  0.559
           MCI.c        28.31849  5.32151 -0.082  0.781
  ID       (Intercept) 124.32049 11.14991
           years         2.82449  1.68062 -0.218
           MCI.c         0.20084  0.44815  0.140 -0.533
  Residual             110.96358 10.53392
number of obs: 720, groups: spouseID, 120; ID, 60

Fixed effects:
             Estimate Std. Error t value
(Intercept) 124.4506     1.6757   74.27
years        -6.2569     0.5211  -12.01
MCI.c        -3.5637     1.0228   -3.48

Correlation of Fixed Effects:
       (Intr) years
years -0.251
MCI.c -0.152  0.567


The results do differ in places, but most of the estimates are similar  
so I was assuming that the differences were due to different  
underlying algorithms used by the two programs. I may well have been  
wrong about this.

> The next question would be exactly how HLM is calculating that
> p-value.  I wonder where the 60 degrees of freedom comes from.  Do you
> happen to have 60 couples in the study?

Yes, there are 60 couples in the study. I believe HLM is calculating  
what Raudenbush and Bryk (2002) call a "Univariate Chi-square" which  
is what they recommend for testing single parameter variance  
components. For multi parameter variance components they recommend a  
likelihood ratio test, which is what I believe the method I used above  
gives.

Thank you for taking the time to respond to my question,
Ista
______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.