standard error for regression coefficients corresponding to factor levels


standard error for regression coefficients corresponding to factor levels

li li
Hi all,
  I have the following data, called "data1". After fitting an ANCOVA model
with a different slope and intercept for each region, I calculated the
regression coefficients and the corresponding standard errors. The standard
errors (both for the intercepts and for the slopes) are all the same across
regions. Is there something wrong?
  I know the SE is related to (X^T X)^-1, where X is the design matrix. Does
this happen whenever each factor level has the same set of values for
"week"?
     Thanks.
     Hanna



> mod <- lm(response ~ region*week, data1)
> tmp <- coef(summary(mod))
> res <- matrix(NA, 5, 4)
> res[1,1:2] <- tmp[1,1:2]
> res[2:5,1] <- tmp[1,1] + tmp[2:5,1]
> res[2:5,2] <- sqrt(tmp[2:5,2]^2 - tmp[1,2]^2)
> res[1,3:4] <- tmp[6,1:2]
> res[2:5,3] <- tmp[6,1] + tmp[7:10,1]
> res[2:5,4] <- sqrt(tmp[7:10,2]^2 - tmp[6,2]^2)

> colnames(res) <- c("intercept", "intercept SE", "slope", "slope SE")
> rownames(res) <- letters[1:5]
> res
   intercept intercept SE        slope   slope SE
a 0.18404464   0.08976301 -0.018629310 0.01385073
b 0.17605666   0.08976301 -0.022393789 0.01385073
c 0.16754130   0.08976301 -0.022367770 0.01385073
d 0.12554452   0.08976301 -0.017464385 0.01385073
e 0.06153256   0.08976301  0.007714685 0.01385073







> data1
    week region     response
5      3      c  0.057325067
6      6      c  0.066723632
7      9      c -0.025317808
12     3      d  0.024692613
13     6      d  0.021761492
14     9      d -0.099820335
19     3      c  0.119559235
20     6      c -0.054456186
21     9      c  0.078811180
26     3      d  0.091667189
27     6      d -0.053400777
28     9      d  0.090754363
33     3      c  0.163818085
34     6      c  0.008959741
35     9      c -0.115410852
40     3      d  0.193920693
41     6      d -0.087738914
42     9      d  0.004987542
47     3      a  0.121332285
48     6      a -0.020202707
49     9      a  0.037295785
54     3      b  0.214304603
55     6      b -0.052346480
56     9      b  0.082501222
61     3      a  0.053540767
62     6      a -0.019182819
63     9      a -0.057629113
68     3      b  0.068592791
69     6      b -0.123298216
70     9      b -0.230671818
75     3      a  0.330741562
76     6      a  0.013902905
77     9      a  0.190620360
82     3      b  0.151002874
83     6      b  0.086177696
84     9      b  0.178982656
89     3      e  0.062974799
90     6      e  0.062035391
91     9      e  0.206200831
96     3      e  0.123102197
97     6      e  0.040181790
98     9      e  0.121332285
103    3      e  0.147557564
104    6      e  0.062035391
105    9      e  0.144965770


______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [FORGED] standard error for regression coefficients corresponding to factor levels

Rolf Turner

You have been posting to the R-help list long enough so that you should
have learned by now *not* to post in html.  Your code is mangled so as
to be unreadable.

A few comments:

(1) Your data frame "data1" seems to have a mysterious (and irrelevant?)
column named "data1" as well.

(2) The covariance matrix of your coefficient estimates is indeed (as
you hint) a constant multiple of (X^T X)^{-1}.  So do:

     X <- model.matrix(~ region * week, data = data1)
     S <- solve(t(X)%*%X)
     print(S)

and you will see the same pattern of constancy that your results exhibit.
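
A quick numerical check (a minimal sketch, with "mod" taken to be the
interaction fit from the original post; vcov() and model.matrix() are the
standard accessors):

     X  <- model.matrix(mod)            # design matrix of the fitted model
     s2 <- summary(mod)$sigma^2         # estimated residual variance
     all.equal(vcov(mod), s2 * solve(crossprod(X)))  # TRUE: vcov is sigma^2 (X'X)^{-1}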

(3) You could get the results you want much more easily, without all the
fooling around buried in your (illegible) code, by doing:

     mod <- lm(response ~ (region - 1)/week, data = data1)
     summary(mod)
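
For comparison with the hand-built table above, the per-region estimates and
their standard errors can then be read straight off the coefficient table
(a minimal sketch; coef(summary()) is the standard accessor):

     coef(summary(mod))[, c("Estimate", "Std. Error")]  # one intercept and one slope per region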

cheers,

Rolf Turner

--
Technical Editor ANZJS
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276



Re: [FORGED] standard error for regression coefficients corresponding to factor levels

Doran, Harold
A slightly more "R-ish" way of doing

S <- solve(t(X) %*% X)

is to instead use

S <- solve(crossprod(X))

That said, the idea of explicitly inverting the SSCP matrix rather than
actually solving the linear system is not so great numerically, which is why
it is better to do as Rolf suggests and get everything you need from lm,
which uses decompositions rather than the algebraic representation of the
solution to the linear system.
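
For instance (a minimal sketch, with "mod" again taken to be the fit from the
original post; qr.coef() and model.response() are base R):

X <- model.matrix(mod)
y <- model.response(model.frame(mod))
qr.coef(qr(X), y)   # least-squares solution via the QR decomposition, same as coef(mod)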



Re: [FORGED] standard error for regression coefficients corresponding to factor levels

Doran, Harold
Just to do a better job of showing what is perhaps a more "R-ish" approach
to least-squares calculations, here is perhaps a better example:

ctl <- c(4.17,5.58,5.18,6.11,4.50,4.61,5.17,4.53,5.33,5.14)
trt <- c(4.81,4.17,4.41,3.59,5.87,3.83,6.03,4.89,4.32,4.69)
group <- gl(2, 10, 20, labels = c("Ctl","Trt"))
weight <- c(ctl, trt)
lm.D9 <- lm(weight ~ group)

X   <- model.matrix(lm.D9)   # design matrix of the fitted model
Xqr <- qr(X)                 # QR decomposition, as lm uses internally
Q   <- qr.Q(Xqr)
R   <- qr.R(Xqr)

### this is the same as just (X'X)^-1
chol2inv(R)

### compared to the prior way I mentioned
solve(crossprod(X))
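
To tie this back to the original question, a minimal sketch (using lm.D9 and
R from above) of where the reported standard errors come from:

s2 <- sum(residuals(lm.D9)^2) / df.residual(lm.D9)  # estimated residual variance
sqrt(diag(s2 * chol2inv(R)))   # matches coef(summary(lm.D9))[, "Std. Error"]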
