Multivariable Wald to test equality of multinomial coefficients

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
5 messages Options
Reply | Threaded
Open this post in threaded view
|

Multivariable Wald to test equality of multinomial coefficients

pgseye
Hi,

I am trying to replicate a test in the Hosmer - Applied Logistic regression text (pp 289, 3rd ed) that uses a Multivariable Wald test to test the equality of coefficients across the 2 logits of a 3 category response multinomial model. I’d like to see whether (from a  statistical standpoint) it is acceptable to collapse the 2 response categories and then simply use a binary logistic regression. The idea is that if the coefficients across the 2 logits are similar (non-significant p value with Wald test), then it is reasonable to pool the categories.

There does not seem to be a built in way to do this in R?

Using the mtcars dataset as an example (for the sake of the example, using cyl as a 3-factor response), does anyone have any ideas how to do this

library(nnet)
data(mtcars)
mtcars$cyl <- as.factor(mtcars$cyl)  
mtcars$am <- as.factor(mtcars$am)  
mod <- multinom(cyl ~ am + hp, data=mtcars)
summary(mod)

> summary(mod)
Call:
multinom(formula = cyl ~ am + hp, data = mtcars)

Coefficients:
  (Intercept)       am1        hp
6   -42.03847  -3.77398 0.4147498
8   -92.30944 -26.27554 0.7836576

So, I want to simultaneously test whether the 3 coefficients across the 2 logits are similar.

Thank you.



        [[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: Multivariable Wald to test equality of multinomial coefficients

Bert Gunter-2
See inline.

-- Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Mon, Oct 3, 2016 at 3:30 PM, Paul Sanfilippo <[hidden email]> wrote:
> Hi,
>
> I am trying to replicate a test in the Hosmer - Applied Logistic regression text (pp 289, 3rd ed) that uses a Multivariable Wald test to test the equality of coefficients across the 2 logits of a 3 category response multinomial model. I’d like to see whether (from a  statistical standpoint) it is acceptable to collapse the 2 response categories and then simply use a binary logistic regression.


"The idea is that if the coefficients across the 2 logits are similar
(non-significant p value with Wald test), then it is reasonable to
pool the categories."

IMHO, this is a bad idea. See
http://www.nature.com/news/statisticians-issue-warning-over-misuse-of-p-values-1.19503.

Significance or lack of it is not a legitimate criterion on which to
base scientific decisions.



>
> There does not seem to be a built in way to do this in R?
>
> Using the mtcars dataset as an example (for the sake of the example, using cyl as a 3-factor response), does anyone have any ideas how to do this
>
> library(nnet)
> data(mtcars)
> mtcars$cyl <- as.factor(mtcars$cyl)
> mtcars$am <- as.factor(mtcars$am)
> mod <- multinom(cyl ~ am + hp, data=mtcars)
> summary(mod)
>
>> summary(mod)
> Call:
> multinom(formula = cyl ~ am + hp, data = mtcars)
>
> Coefficients:
>   (Intercept)       am1        hp
> 6   -42.03847  -3.77398 0.4147498
> 8   -92.30944 -26.27554 0.7836576
>
> So, I want to simultaneously test whether the 3 coefficients across the 2 logits are similar.
>
> Thank you.
>
>
>
>         [[alternative HTML version deleted]]
>
> ______________________________________________
> [hidden email] mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: Multivariable Wald to test equality of multinomial coefficients

pgseye
Thanks Bert,

I realise that and am not distilling it down to the p value. I am primarily considering the issue of collapsing down in the larger context of the added value/information that the extra categories give. However, I am curious to see (as I said from a statistical standpoint) whether the coefficients are sufficiently dissimilar.

Regards,

Paul Sanfilippo 



On 4 October 2016 at 10:08:12 am, Bert Gunter ([hidden email]) wrote:

See inline.  

-- Bert  

Bert Gunter  

"The trouble with having an open mind is that people keep coming along  
and sticking things into it."  
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )  


On Mon, Oct 3, 2016 at 3:30 PM, Paul Sanfilippo <[hidden email]> wrote:  
> Hi,  
>  
> I am trying to replicate a test in the Hosmer - Applied Logistic regression text (pp 289, 3rd ed) that uses a Multivariable Wald test to test the equality of coefficients across the 2 logits of a 3 category response multinomial model. I’d like to see whether (from a statistical standpoint) it is acceptable to collapse the 2 response categories and then simply use a binary logistic regression.  


"The idea is that if the coefficients across the 2 logits are similar  
(non-significant p value with Wald test), then it is reasonable to  
pool the categories."  

IMHO, this is a bad idea. See  
http://www.nature.com/news/statisticians-issue-warning-over-misuse-of-p-values-1.19503.  

Significance or lack of it is not a legitimate criterion on which to  
base scientific decisions.  



>  
> There does not seem to be a built in way to do this in R?  
>  
> Using the mtcars dataset as an example (for the sake of the example, using cyl as a 3-factor response), does anyone have any ideas how to do this  
>  
> library(nnet)  
> data(mtcars)  
> mtcars$cyl <- as.factor(mtcars$cyl)  
> mtcars$am <- as.factor(mtcars$am)  
> mod <- multinom(cyl ~ am + hp, data=mtcars)  
> summary(mod)  
>  
>> summary(mod)  
> Call:  
> multinom(formula = cyl ~ am + hp, data = mtcars)  
>  
> Coefficients:  
> (Intercept) am1 hp  
> 6 -42.03847 -3.77398 0.4147498  
> 8 -92.30944 -26.27554 0.7836576  
>  
> So, I want to simultaneously test whether the 3 coefficients across the 2 logits are similar.  
>  
> Thank you.  
>  
>  
>  
> [[alternative HTML version deleted]]  
>  
> ______________________________________________  
> [hidden email] mailing list -- To UNSUBSCRIBE and more, see  
> https://stat.ethz.ch/mailman/listinfo/r-help 
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html 
> and provide commented, minimal, self-contained, reproducible code.  

        [[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: Multivariable Wald to test equality of multinomial coefficients

Peter Dalgaard-2
In reply to this post by pgseye

> On 04 Oct 2016, at 00:30 , Paul Sanfilippo <[hidden email]> wrote:
>
> Hi,
>
> I am trying to replicate a test in the Hosmer - Applied Logistic regression text (pp 289, 3rd ed) that uses a Multivariable Wald test to test the equality of coefficients across the 2 logits of a 3 category response multinomial model. I’d like to see whether (from a  statistical standpoint) it is acceptable to collapse the 2 response categories and then simply use a binary logistic regression. The idea is that if the coefficients across the 2 logits are similar (non-significant p value with Wald test), then it is reasonable to pool the categories.
>
> There does not seem to be a built in way to do this in R?
>
> Using the mtcars dataset as an example (for the sake of the example, using cyl as a 3-factor response), does anyone have any ideas how to do this
>
> library(nnet)
> data(mtcars)
> mtcars$cyl <- as.factor(mtcars$cyl)  
> mtcars$am <- as.factor(mtcars$am)  
> mod <- multinom(cyl ~ am + hp, data=mtcars)
> summary(mod)
>
>> summary(mod)
> Call:
> multinom(formula = cyl ~ am + hp, data = mtcars)
>
> Coefficients:
>   (Intercept)       am1        hp
> 6   -42.03847  -3.77398 0.4147498
> 8   -92.30944 -26.27554 0.7836576
>
> So, I want to simultaneously test whether the 3 coefficients across the 2 logits are similar.

R and R-packages do not always produce every single test that someone have thought up. Sometimes, you just get the tools to roll your own, and are expected to know enough theory to do so.

The generic technique for a Wald test would be to

(a) extract the variance-covariance matrix (V) of the coefficients (beta)
(b) write the linear relation that you wish to test in matrix form A beta = 0
(c) the variance-covariance matrix of A beta will be A V A'
(d) the Wald test is (A beta)' (A V A')^{-1} (A beta)

For the case of multinom(), a little extra care is needed because it presents the coefficients as a matrix, and the variance covariance matrix is ordered as if the coefficients were organized as a vector _by row_:

> vcov(mod)
              6:(Intercept)       6:am1        6:hp 8:(Intercept)        8:am1
6:(Intercept)    771.682250  69.0782649 -7.61945359    168.212743  -463.676388
6:am1             69.078265  10.6015542 -0.71221686     13.069175   -38.850954
6:hp              -7.619454  -0.7122169  0.07550636     -1.647338     4.560144
8:(Intercept)    168.212743  13.0691754 -1.64733776   1019.860307 -1473.837691
8:am1           -463.676388 -38.8509537  4.56014436  -1473.837691  2195.306673
8:hp              -3.169803  -0.2992327  0.03147124     -7.719801    11.719345
                     8:hp
6:(Intercept) -3.16980299
6:am1         -0.29923273
6:hp           0.03147124
8:(Intercept) -7.71980097
8:am1         11.71934471
8:hp           0.06548745
> coef(mod)
  (Intercept)       am1        hp
6   -42.03847  -3.77398 0.4147498
8   -92.30944 -26.27554 0.7836576

So the thing to do would be roughly like this (the code can surely be improved):

> beta <- as.vector(t(coef(mod)))
> A  <- rbind(c(1,0,0,-1,0,0), c(0,1,0,0,-1,0), c(0,0,1,0,0,-1))
> A
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]    1    0    0   -1    0    0
[2,]    0    1    0    0   -1    0
[3,]    0    0    1    0    0   -1
> A %*% beta
           [,1]
[1,] 50.2709610
[2,] 22.5015565
[3,] -0.3689079
> t(A %*% beta) %*% solve(A %*% vcov(mod) %*% t(A), A %*% beta)
         [,1]
[1,] 3.592326

And then get the p-value from the asymptotic chi-square on 3-df

> pchisq(3.59, 3, lower=FALSE)
[1] 0.3092756

 

--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: [hidden email]  Priv: [hidden email]

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: Multivariable Wald to test equality of multinomial coefficients

pgseye
Thank you very much Peter - very enlightening to do it from first principles!

Regards,

Paul



On 4 October 2016 at 10:23:52 am, peter dalgaard ([hidden email]) wrote:


> On 04 Oct 2016, at 00:30 , Paul Sanfilippo <[hidden email]> wrote:  
>  
> Hi,  
>  
> I am trying to replicate a test in the Hosmer - Applied Logistic regression text (pp 289, 3rd ed) that uses a Multivariable Wald test to test the equality of coefficients across the 2 logits of a 3 category response multinomial model. I’d like to see whether (from a statistical standpoint) it is acceptable to collapse the 2 response categories and then simply use a binary logistic regression. The idea is that if the coefficients across the 2 logits are similar (non-significant p value with Wald test), then it is reasonable to pool the categories.  
>  
> There does not seem to be a built in way to do this in R?  
>  
> Using the mtcars dataset as an example (for the sake of the example, using cyl as a 3-factor response), does anyone have any ideas how to do this  
>  
> library(nnet)  
> data(mtcars)  
> mtcars$cyl <- as.factor(mtcars$cyl)  
> mtcars$am <- as.factor(mtcars$am)  
> mod <- multinom(cyl ~ am + hp, data=mtcars)  
> summary(mod)  
>  
>> summary(mod)  
> Call:  
> multinom(formula = cyl ~ am + hp, data = mtcars)  
>  
> Coefficients:  
> (Intercept) am1 hp  
> 6 -42.03847 -3.77398 0.4147498  
> 8 -92.30944 -26.27554 0.7836576  
>  
> So, I want to simultaneously test whether the 3 coefficients across the 2 logits are similar.  

R and R-packages do not always produce every single test that someone have thought up. Sometimes, you just get the tools to roll your own, and are expected to know enough theory to do so.  

The generic technique for a Wald test would be to  

(a) extract the variance-covariance matrix (V) of the coefficients (beta)  
(b) write the linear relation that you wish to test in matrix form A beta = 0  
(c) the variance-covariance matrix of A beta will be A V A'  
(d) the Wald test is (A beta)' (A V A')^{-1} (A beta)  

For the case of multinom(), a little extra care is needed because it presents the coefficients as a matrix, and the variance covariance matrix is ordered as if the coefficients were organized as a vector _by row_:  

> vcov(mod)  
6:(Intercept) 6:am1 6:hp 8:(Intercept) 8:am1  
6:(Intercept) 771.682250 69.0782649 -7.61945359 168.212743 -463.676388  
6:am1 69.078265 10.6015542 -0.71221686 13.069175 -38.850954  
6:hp -7.619454 -0.7122169 0.07550636 -1.647338 4.560144  
8:(Intercept) 168.212743 13.0691754 -1.64733776 1019.860307 -1473.837691  
8:am1 -463.676388 -38.8509537 4.56014436 -1473.837691 2195.306673  
8:hp -3.169803 -0.2992327 0.03147124 -7.719801 11.719345  
8:hp  
6:(Intercept) -3.16980299  
6:am1 -0.29923273  
6:hp 0.03147124  
8:(Intercept) -7.71980097  
8:am1 11.71934471  
8:hp 0.06548745  
> coef(mod)  
(Intercept) am1 hp  
6 -42.03847 -3.77398 0.4147498  
8 -92.30944 -26.27554 0.7836576  

So the thing to do would be roughly like this (the code can surely be improved):  

> beta <- as.vector(t(coef(mod)))  
> A <- rbind(c(1,0,0,-1,0,0), c(0,1,0,0,-1,0), c(0,0,1,0,0,-1))  
> A  
[,1] [,2] [,3] [,4] [,5] [,6]  
[1,] 1 0 0 -1 0 0  
[2,] 0 1 0 0 -1 0  
[3,] 0 0 1 0 0 -1  
> A %*% beta  
[,1]  
[1,] 50.2709610  
[2,] 22.5015565  
[3,] -0.3689079  
> t(A %*% beta) %*% solve(A %*% vcov(mod) %*% t(A), A %*% beta)  
[,1]  
[1,] 3.592326  

And then get the p-value from the asymptotic chi-square on 3-df  

> pchisq(3.59, 3, lower=FALSE)  
[1] 0.3092756  



--  
Peter Dalgaard, Professor,  
Center for Statistics, Copenhagen Business School  
Solbjerg Plads 3, 2000 Frederiksberg, Denmark  
Phone: (+45)38153501  
Office: A 4.23  
Email: [hidden email] Priv: [hidden email]  










        [[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.