linearHypothesis


Johan Lassen
Dear R-users,

I am using the R function "linearHypothesis" to test whether the sum of all
parameters except the intercept in a multiple linear regression differs
from zero.
Is it statistically valid to use linearHypothesis() for this?
Below is a reproducible example in R. A multiple regression: y =
beta0*t0+beta1*t1+beta2*t2+beta3*t3+beta4*t4

It seems to me that linearHypothesis() computes an F-test on the extra
residual sum of squares when going from the starting model to a
'subset' model, although all variables in the 'subset' model differ from
the variables in the starting model.
I normally think of a subset model as a model built on the same input data
as the starting model minus one variable.

Hence, is this a valid calculation?

Thanks in advance,
Johan

# R-code:
library(car)  # provides linearHypothesis()
y <-
c(101133190,96663050,106866486,97678429,83212348,75719714,77861937,74018478,82181104,68667176,64599495,62414401,63534709,58571865,65222727,60139788,
63355011,57790610,55214971,55535484,55759192,49450719,48834699,51383864,51250871,50629835,52154608,54636478,54942637)

data <-
data.frame(y,"t0"=1,"t1"=1990:2018,"t2"=c(rep(0,12),1:17),"t3"=c(rep(0,17),1:12),"t4"=c(rep(0,23),1:6))

model <- lm(y~t0+t1+t2+t3+t4+0,data=data)

linearHypothesis(model,"t1+t2+t3+t4=0",test=c("F"))

# Reproduce the result from linearHypothesis:
# beta1+beta2+beta3+beta4=0 -> beta4=-(beta1+beta2+beta3) ->
# y=beta0+beta1*t1+beta2*t2+beta3*t3-(beta1+beta2+beta3)*t4
# y = beta0'+beta1'*(t1-t4)+beta2'*(t2-t4)+beta3'*(t3-t4)

data$t1 <- data$t1-data$t4
data$t2 <- data$t2-data$t4
data$t3 <- data$t3-data$t4

model_reduced <- lm(y~t0+t1+t2+t3+0,data=data)

anova(model_reduced,model)

--
Johan Lassen

"In the cities people live in time -
in the mountains people live in space" (Buddhist monk).

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: linearHypothesis

Fox, John
Dear Johan,

On 2020-09-17 9:07 a.m., Johan Lassen wrote:
> Dear R-users,
>
> I am using the R function "linearHypothesis" to test whether the sum of all
> parameters except the intercept in a multiple linear regression differs
> from zero.
> Is it statistically valid to use linearHypothesis() for this?

Yes, assuming of course that the hypothesis makes sense.


> Below is a reproducible example in R. A multiple regression: y =
> beta0*t0+beta1*t1+beta2*t2+beta3*t3+beta4*t4
>
> It seems to me that linearHypothesis() computes an F-test on the extra
> residual sum of squares when going from the starting model to a
> 'subset' model, although all variables in the 'subset' model differ from
> the variables in the starting model.
> I normally think of a subset model as a model built on the same input data
> as the starting model minus one variable.
>
> Hence, is this a valid calculation?

First, linearHypothesis() doesn't literally fit alternative models, but
rather tests the linear hypothesis directly from the coefficient
estimates and their covariance matrix. The test is standard -- look at
the references in ?linearHypothesis or most texts on linear models.
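
To make the point above concrete, here is a minimal base-R sketch of a Wald-type F statistic computed only from the coefficient estimates and their covariance matrix. It illustrates the general idea and is not the internal code of linearHypothesis(); the names dat, L, rhs, F_wald and p_wald are chosen for this example, and the model is written with the default intercept instead of t0.

```r
# Data from the original post
y <- c(101133190, 96663050, 106866486, 97678429, 83212348, 75719714,
       77861937, 74018478, 82181104, 68667176, 64599495, 62414401,
       63534709, 58571865, 65222727, 60139788, 63355011, 57790610,
       55214971, 55535484, 55759192, 49450719, 48834699, 51383864,
       51250871, 50629835, 52154608, 54636478, 54942637)
dat <- data.frame(y,
                  t1 = 1990:2018,
                  t2 = c(rep(0, 12), 1:17),
                  t3 = c(rep(0, 17), 1:12),
                  t4 = c(rep(0, 23), 1:6))

model <- lm(y ~ t1 + t2 + t3 + t4, data = dat)

b <- coef(model)   # coefficient estimates (intercept first)
V <- vcov(model)   # estimated covariance matrix of the coefficients

# Hypothesis matrix for beta1 + beta2 + beta3 + beta4 = 0
# (the leading 0 leaves the intercept out of the sum)
L   <- matrix(c(0, 1, 1, 1, 1), nrow = 1)
rhs <- 0
q   <- nrow(L)     # number of restrictions being tested

est    <- L %*% b - rhs
F_wald <- as.numeric(t(est) %*% solve(L %*% V %*% t(L)) %*% est) / q
p_wald <- pf(F_wald, q, df.residual(model), lower.tail = FALSE)

c(F = F_wald, p = p_wald)
```

No second model is fitted here: only coef(model) and vcov(model) enter the calculation.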

Second, formulating the hypothesis using alternative models is also
legitimate, since the second model is a restricted version of the first.

>
> Thanks in advance,
> Johan
>
> # R-code:
> y <-
> c(101133190,96663050,106866486,97678429,83212348,75719714,77861937,74018478,82181104,68667176,64599495,62414401,63534709,58571865,65222727,60139788,
> 63355011,57790610,55214971,55535484,55759192,49450719,48834699,51383864,51250871,50629835,52154608,54636478,54942637)
>
> data <-
> data.frame(y,"t0"=1,"t1"=1990:2018,"t2"=c(rep(0,12),1:17),"t3"=c(rep(0,17),1:12),"t4"=c(rep(0,23),1:6))
>
> model <- lm(y~t0+t1+t2+t3+t4+0,data=data)

You need not supply the constant regressor t0 explicitly and suppress
the intercept -- you'd get the same test from linearHypothesis() for
lm(y~t1+t2+t3+t4,data=data).
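
As a quick check of this remark, the sketch below fits both specifications and compares their coefficients and covariance matrices. The object names dat, m_explicit and m_default are chosen for this example; only base R is used.

```r
# Data from the original post, including the explicit constant column t0
y <- c(101133190, 96663050, 106866486, 97678429, 83212348, 75719714,
       77861937, 74018478, 82181104, 68667176, 64599495, 62414401,
       63534709, 58571865, 65222727, 60139788, 63355011, 57790610,
       55214971, 55535484, 55759192, 49450719, 48834699, 51383864,
       51250871, 50629835, 52154608, 54636478, 54942637)
dat <- data.frame(y,
                  t0 = 1,
                  t1 = 1990:2018,
                  t2 = c(rep(0, 12), 1:17),
                  t3 = c(rep(0, 17), 1:12),
                  t4 = c(rep(0, 23), 1:6))

# Explicit constant regressor with the automatic intercept suppressed
m_explicit <- lm(y ~ t0 + t1 + t2 + t3 + t4 + 0, data = dat)

# Default specification: lm() adds the intercept itself
m_default <- lm(y ~ t1 + t2 + t3 + t4, data = dat)

# The first coefficient is named "t0" in one fit and "(Intercept)" in the
# other, so names are dropped before comparing the numbers
same_coef <- all.equal(unname(coef(m_explicit)), unname(coef(m_default)))
same_vcov <- all.equal(unname(vcov(m_explicit)), unname(vcov(m_default)))
print(c(coefficients = same_coef, covariance = same_vcov))
```

Both fits use the same design matrix, so any test built from the coefficients and their covariance matrix gives the same result for either one.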

>
> linearHypothesis(model,"t1+t2+t3+t4=0",test=c("F"))

test = "F" is the default.

>
> # Reproduce the result from linearHypothesis:
> # beta1+beta2+beta3+beta4=0 -> beta4=-(beta1+beta2+beta3) ->
> # y=beta0+beta1*t1+beta2*t2+beta3*t3-(beta1+beta2+beta3)*t4
> # y = beta0'+beta1'*(t1-t4)+beta2'*(t2-t4)+beta3'*(t3-t4)
>
> data$t1 <- data$t1-data$t4
> data$t2 <- data$t2-data$t4
> data$t3 <- data$t3-data$t4
>
> model_reduced <- lm(y~t0+t1+t2+t3+0,data=data)
>
> anova(model_reduced,model)

Yes, this is equivalent to the test performed by linearHypothesis()
using the coefficients and their covariances from the original model.
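
The comparison can also be written without overwriting columns of the data frame, by putting the restriction into the formula with I(). This is a sketch under the same assumptions as the original example; the names dat, full, restricted, cmp and F_anova are chosen here, and the call to car::linearHypothesis() runs only if the car package is installed.

```r
# Data from the original post
y <- c(101133190, 96663050, 106866486, 97678429, 83212348, 75719714,
       77861937, 74018478, 82181104, 68667176, 64599495, 62414401,
       63534709, 58571865, 65222727, 60139788, 63355011, 57790610,
       55214971, 55535484, 55759192, 49450719, 48834699, 51383864,
       51250871, 50629835, 52154608, 54636478, 54942637)
dat <- data.frame(y,
                  t1 = 1990:2018,
                  t2 = c(rep(0, 12), 1:17),
                  t3 = c(rep(0, 17), 1:12),
                  t4 = c(rep(0, 23), 1:6))

full <- lm(y ~ t1 + t2 + t3 + t4, data = dat)

# Restricted model: substituting beta4 = -(beta1 + beta2 + beta3)
# turns the regressors into t1 - t4, t2 - t4 and t3 - t4
restricted <- lm(y ~ I(t1 - t4) + I(t2 - t4) + I(t3 - t4), data = dat)

cmp     <- anova(restricted, full)  # F-test by model comparison
F_anova <- cmp$F[2]
print(cmp)

# Optional cross-check against car, if it is available
if (requireNamespace("car", quietly = TRUE)) {
  lh <- car::linearHypothesis(full, "t1 + t2 + t3 + t4 = 0")
  print(all.equal(lh$F[2], F_anova))
}
```

Because the original columns stay untouched, the full model can be refitted or reused later without rebuilding the data frame.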

I hope this helps,
  John

--
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://socialsciences.mcmaster.ca/jfox/


Re: linearHypothesis

Fox, John
Dear Johan,

It's generally a good idea to keep the conversation on r-help to allow
list members to follow it, and so I'm cc'ing this response to the list.

I hope that it's clear that car::linearHypothesis() computes the test as
a Wald test of a linear hypothesis and not as a likelihood-ratio test by
model comparison. As your example illustrates, however, the two tests
are the same for a linear model, but this is not true more generally.
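
For reference, the two statistics being compared can be written as follows. The notation is introduced here and does not come from the thread: b is the vector of least-squares estimates, \hat{V} its estimated covariance matrix, L the hypothesis matrix with q rows, c the right-hand side, RSS_0 and RSS_1 the residual sums of squares of the restricted and full models, n the number of observations and p the number of coefficients in the full model.

```latex
% Wald form, computed from the full model only
F_{\mathrm{Wald}} =
  \frac{(Lb - c)^{\top} \left[ L \hat{V} L^{\top} \right]^{-1} (Lb - c)}{q},
\qquad
\hat{V} = s^{2} (X^{\top} X)^{-1},
\quad
s^{2} = \frac{\mathrm{RSS}_1}{n - p}

% Model-comparison form, computed from two fitted models
F_{\mathrm{comparison}} =
  \frac{(\mathrm{RSS}_0 - \mathrm{RSS}_1) / q}{\mathrm{RSS}_1 / (n - p)}

% Example in this thread: L = (0, 1, 1, 1, 1), c = 0, q = 1, n = 29, p = 5
```

For a linear model fitted by least squares the two expressions are algebraically identical, and both are referred to an F distribution with q and n - p degrees of freedom.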

As I mentioned, you can find the details in many sources, including in
Section 5.3.5 of Fox and Weisberg, An R Companion to Applied Regression,
3rd Edition, the book with which the car package is associated.

Best,
  John

On 2020-09-17 4:03 p.m., Johan Lassen wrote:

> Thank you John - highly appreciated! Yes, you are right, the less
> complex model may be seen as a restricted version of the starting model,
> although its set of variables is not directly a subset of the variables
> of the starting model. What confused me at first was that I think of a
> subset model as a model whose variables are a direct subset of those in
> the starting model. Even though this is not the case in the example, the
> test is still on a restricted version of the starting model.
> Thanks,
> Johan

--
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://socialsciences.mcmaster.ca/jfox/
