relevel() categorical variables in a GLM


ashley
Hello list readers,

I am running a set of GLMs on fish spp presence/absence as a function of various habitat characteristics. My response is binomial and I have four predictors, three of which are categorical.

So, R takes one of my predictor-variables away to use as the intercept (the first one alphabetically). However, I want to know the coefficient and SE of this predictor. I tried relevel() and reran the model. Abbreviated summary() results for each run are below. The results seem drastically different. Have I done the wrong thing?

(Below is a result from the model with only one predictor, to save space and hassle.)

Thanks,
Ashley

#Default reference level = HH:

                        Estimate Std. Error z value Pr(>|z|)
(Intercept)              -5.2671     0.2781 -18.942   <2e-16 ***
raw.table$SubsComboHS     0.8127     0.6438   1.262    0.207
raw.table$SubsComboSH    -0.5736     1.0393  -0.552    0.581
raw.table$SubsComboSS   -18.2990   923.6023  -0.020    0.984

#Command used to change reference level:
> raw.table$SubsCombo<-relevel(raw.table$SubsCombo, ref="SS")

#New reference level = SS:

                        Estimate Std. Error z value Pr(>|z|)
(Intercept)               -23.57     923.60  -0.026    0.980
raw.table$SubsComboHH      18.30     923.60   0.020    0.984
raw.table$SubsComboHS      19.11     923.60   0.021    0.983
raw.table$SubsComboSH      17.73     923.60   0.019    0.985


Re: relevel() categorical variables in a GLM

Joshua Wiley-2
Hi Ashley,

It does not look to me like you have done anything wrong.  The
results differ because each of the parameter estimates is
now the change from SS to ___ instead of from HH to ____.  In fact,
from your first table, you can calculate all the parameters in the
second.  The intercept with SS as reference is:

(-5.2671) + (-18.2990) = -23.5661

the difference between SH and SS is:
> (-0.5736) - (-18.2990)
[1] 17.7254

which is now the parameter estimate for SH in the model with SS as
reference.  You could go on in like fashion for the rest.
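
The same bookkeeping can be checked in R. The following is only a sketch on made-up counts (the data frames `d_hh` and `d_ss` and their numbers are hypothetical, not the poster's data): it fits the model once with HH and once with SS as the reference level, then rebuilds the SS-reference coefficients from the HH-reference ones by the additions and subtractions described above.

```r
# Hypothetical presence/absence counts per substrate class (not the poster's data).
d_hh <- data.frame(
  SubsCombo = factor(c("HH", "HS", "SH", "SS")),  # default reference level: HH
  present   = c(10, 8, 4, 6),
  absent    = c(90, 32, 46, 94)
)

# Same data, but with SS as the reference level.
d_ss <- d_hh
d_ss$SubsCombo <- relevel(d_ss$SubsCombo, ref = "SS")

fit_hh <- glm(cbind(present, absent) ~ SubsCombo, family = binomial, data = d_hh)
fit_ss <- glm(cbind(present, absent) ~ SubsCombo, family = binomial, data = d_ss)

b_hh <- coef(fit_hh)
b_ss <- coef(fit_ss)

# Rebuild the SS-reference coefficients from the HH-reference ones.
rebuilt <- c(
  "(Intercept)" = unname(b_hh["(Intercept)"] + b_hh["SubsComboSS"]),
  "SubsComboHH" = unname(-b_hh["SubsComboSS"]),
  "SubsComboHS" = unname(b_hh["SubsComboHS"] - b_hh["SubsComboSS"]),
  "SubsComboSH" = unname(b_hh["SubsComboSH"] - b_hh["SubsComboSS"])
)

# Side-by-side comparison of the refitted and the rebuilt coefficients.
print(round(cbind(refit = b_ss, rebuilt = rebuilt[names(b_ss)]), 4))

# Both parameterisations describe the same fitted model.
same_coefs  <- isTRUE(all.equal(unname(b_ss), unname(rebuilt[names(b_ss)]),
                                tolerance = 1e-6))
same_fitted <- isTRUE(all.equal(fitted(fit_hh), fitted(fit_ss), tolerance = 1e-6))
```

Passing the data frame through `data =` rather than writing `raw.table$SubsCombo` in the formula is a style choice made here; it also keeps the coefficient names short.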

HTH,

Josh




--
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: relevel() categorical variables in a GLM

ashley
Thanks Josh,

This makes sense. The coefficient of one parameter is given in reference to another parameter. (Aha! The "reference" parameter.)

I'm still a little confused about the standard errors (SEs) and why they change too. Should I change the reference around until I find the best-looking SEs? Somehow that doesn't seem right...

I am trying to eventually make statements like:
"Fish A showed a weak, yet positive, response to HS and the low SE gives us confidence in this association"
"Fish A showed a weak negative response to SH and the low SE gives us confidence in this association (though less confidence than for HS)"
"Fish A showed a strong negative response to SS, however the SE is very high so we cannot say this with high certainty"

Do these sound like an accurate reflection of what the output is saying? (I know that "low SE" is arguable but...)

So then, is the SE for HH 923.60 (2nd run) or 0.2781 (1st run)?

Thank you!
Ashley





--
**Please note new extension
_____________________________________________


Ashley Knight
Rote Program Assistant
Research Assistant
Institute for Applied Marine Ecology, CSU Monterey Bay
Chapman Science Academic Center (Bldg 53)
100 Campus Center, Seaside, CA 93950

(831) 582-4522
[hidden email]
http://sep.csumb.edu/ifame
_____________________________________________


Re: relevel() categorical variables in a GLM

David Winsemius

On May 29, 2011, at 5:10 AM, ashley wrote:

> Thanks Josh,
>
> This makes sense. The coefficient of one parameter is given in reference to
> another parameter. (Aha! The "reference" parameter.)
>
> I'm still a little confused on the standard errors (SEs) and why they change
> too? Should I change the reference around until I find the best-looking SEs?
> Somehow that doesn't seem right...

If you choose a reference level that has only a few members, then all
the comparisons will reflect that data situation, i.e. the standard
errors for any contrast with a group that has scanty numbers will be large.
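
A small sketch of this point, on made-up counts (the data frame `d`, its numbers, and the helper `se_table()` are hypothetical; SS is deliberately given very few observations). For a logistic model with a single factor, the SE of a group's own log-odds is sqrt(1/present + 1/absent), and the SE of a contrast between two groups is the square root of the sum of the two variances. So the intercept SE describes the reference group itself, while every other SE describes a difference from the reference and inherits the reference group's uncertainty.

```r
# Hypothetical counts (not the poster's data); SS is deliberately tiny.
d <- data.frame(
  SubsCombo = factor(c("HH", "HS", "SH", "SS")),
  present   = c(40, 30, 20, 1),
  absent    = c(360, 170, 180, 9)
)

# Refit with a chosen reference level and return the Wald standard errors.
# d is modified only inside the function; the global copy is untouched.
se_table <- function(ref) {
  d$SubsCombo <- relevel(d$SubsCombo, ref = ref)
  fit <- glm(cbind(present, absent) ~ SubsCombo, family = binomial, data = d)
  coef(summary(fit))[, "Std. Error"]
}

se_hh <- se_table("HH")  # large reference group
se_ss <- se_table("SS")  # scanty reference group

# Closed-form variance of each group's log-odds.
v <- setNames(1 / d$present + 1 / d$absent, as.character(d$SubsCombo))

print(round(se_hh, 3))
print(round(se_ss, 3))

# Intercept SE = SE of the reference group's own log-odds;
# contrast SE  = sqrt(variance of group + variance of reference).
closed_hh <- c(sqrt(v[["HH"]]), sqrt(v[c("HS", "SH", "SS")] + v[["HH"]]))
closed_ss <- c(sqrt(v[["SS"]]), sqrt(v[c("HH", "HS", "SH")] + v[["SS"]]))
```

Read this way, changing the reference level does not make any estimate better or worse; it only changes which differences the table reports.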
>
> I am trying to eventually make statements like:
> "Fish A showed a weak, yet positive, response to HS and the low SE gives us
> confidence in this association"

I didn't see any contrasts in your models that merited that
representation. In general, you should be using likelihood ratio tests
for inference, rather than Wald tests. The example you provide is one
good illustration of why that is so.
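
To make the contrast between the two kinds of test concrete, here is a sketch on made-up counts (the data frame `d` and its numbers are hypothetical, not the poster's data). One group has no presences at all, which is one situation that can produce a very large negative estimate with an enormous standard error like the SS row in the original output. The Wald z test for that coefficient then says almost nothing, while a likelihood ratio test comparing the model with and without the factor can still detect that the groups differ.

```r
# Hypothetical counts (not the poster's data); SS has zero presences.
d <- data.frame(
  SubsCombo = factor(c("HH", "HS", "SH", "SS")),
  present   = c(20, 25, 15, 0),
  absent    = c(80, 75, 85, 60)
)

fit_null <- glm(cbind(present, absent) ~ 1,         family = binomial, data = d)
fit_full <- glm(cbind(present, absent) ~ SubsCombo, family = binomial, data = d)

# Wald test for the SS coefficient, as reported by summary().
wald_p_ss <- coef(summary(fit_full))["SubsComboSS", "Pr(>|z|)"]

# Likelihood ratio test for the factor as a whole: drop in deviance
# between the nested models, referred to a chi-squared distribution.
lr_stat <- deviance(fit_null) - deviance(fit_full)
lr_df   <- df.residual(fit_null) - df.residual(fit_full)
lr_p    <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)

# The same comparison via anova().
print(anova(fit_null, fit_full, test = "Chisq"))

cat("Wald p-value for SS coefficient:", format.pval(wald_p_ss), "\n")
cat("LRT p-value for SubsCombo:      ", format.pval(lr_p), "\n")
```

With several predictors, as in the full analysis, `drop1(fit, test = "Chisq")` gives the analogous likelihood ratio test for each term in turn.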

--
David.


David Winsemius, MD
West Hartford, CT
