Don't know what test I have to use


gaiarrido
Hello,
I'm just starting my PhD and I'm stuck because I have only a little knowledge of R and statistics.
I have a model of this kind:
binary response variable: prevalence of infection (0/1)
3 categorical independent variables: sex, month and name of the area

I was trying a full model like this, before simplification:

model <- aov(prevalencia ~ sex * month * area)

but the Fligner test indicated that the variances are not homogeneous (no homoscedasticity), so I suppose I should try glm instead, with a model like

model2 <- glm(prevalencia ~ edad * sexo * mes * zona, family = binomial)

Is that correct? Where should I put the link (logit)?

Thanks very much
Mario Garrido Escudero
PhD student
Dpto. de Biología Animal, Ecología, Parasitología, Edafología y Qca. Agrícola
Universidad de Salamanca

Re: Don't know what test I have to use

David Winsemius

On Jan 12, 2011, at 12:51 PM, gaiarrido wrote:

> model2<-glm(prevalencia~edad*sexo*mes*zona,binomial)
>
> is that correct? where I must put the link (logit) ?

Why not read the help for binomial, which is referenced on the help
page for glm? There you will learn what the default link is for binomial.
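
For readers who want to check this without leaving the console, the family object itself reports its link. A minimal sketch using only base R (the variable name fam is just for illustration):

```r
# Build the family object that glm() uses when you pass `binomial`
fam <- binomial()

fam$family   # name of the error distribution
fam$link     # name of the default link function: "logit"

# The link can also be named explicitly; the result is the same link
same_link <- identical(binomial()$link, binomial(link = "logit")$link)
same_link

# The link function maps a probability to the linear-predictor scale
logit_half <- fam$linkfun(0.5)
logit_half
```

The same information is on the help page opened by ?family.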

--
David

> View this message in context: http://r.789695.n4.nabble.com/Don-t-know-what-test-i-have-to-use-tp3214491p3214491.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius, MD
West Hartford, CT

Re: Don't know what test I have to use

Joshua Wiley-2
In reply to this post by gaiarrido
Hi,

That is basically correct.  You can specify the link as logit (see my
example), but that is the default so you do not strictly need to in
this case.  I would encourage you to keep your variables
(prevalencia, edad, sexo, mes, zona) stored in a data frame, in which
case you would add the data = argument to glm().

model2 <- glm(prevalencia ~ edad * sexo * mes * zona,
  family = binomial(link = "logit"),
  data = your_dataframe)

Also, you might take a look at ?predict.glm; it has some examples with
binomial data based on the wonderful book by Drs. Venables and
Ripley.  Oh, and finally, if you have 12 levels of months, ? levels of
zones, and 2 levels of sex, you might not want the 4-way interactions
that you will get by default from using the '*' operator inside a
formula.  Unless you have a theory that there is an additional effect
of being a middle-aged female in the month of July for zone 8, but
not....
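
To see what the '*' operator asks for, one can count model-matrix columns before fitting anything. The sketch below uses simulated data as a stand-in for the real data set: the level counts (2 ages, 2 sexes, 3 months, 4 zones), the level labels, and treating edad as a factor are all assumptions, not facts from this thread. It then fits a model limited to two-way interactions via the ^2 formula shorthand, as one possible alternative.

```r
set.seed(1)
n <- 400

# Simulated stand-in data; every value and level label here is made up
dat <- data.frame(
  edad = factor(sample(c("juvenile", "adult"), n, replace = TRUE)),
  sexo = factor(sample(c("F", "M"), n, replace = TRUE)),
  mes  = factor(sample(c("mar", "jul", "nov"), n, replace = TRUE)),
  zona = factor(sample(paste0("z", 1:4), n, replace = TRUE))
)
dat$prevalencia <- rbinom(n, size = 1, prob = 0.3)

# Number of coefficients each formula requests (no model fitting needed)
n_full <- ncol(model.matrix(~ edad * sexo * mes * zona, data = dat))
n_two  <- ncol(model.matrix(~ (edad + sexo + mes + zona)^2, data = dat))
n_main <- ncol(model.matrix(~ edad + sexo + mes + zona, data = dat))
c(full = n_full, two_way = n_two, main = n_main)

# Fit the version with main effects and two-way interactions only
fit <- glm(prevalencia ~ (edad + sexo + mes + zona)^2,
           family = binomial(link = "logit"),
           data = dat)

# Fitted probabilities of infection on the response scale
probs <- predict(fit, type = "response")
head(probs)
```

With these assumed level counts the full factorial formula requests 48 coefficients, against 25 for the two-way version and 8 for main effects only; with 12 months the gap would be much larger.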

Cheers,

Josh

--
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/

Re: Don't know what test I have to use

gaiarrido
Thanks very much to you both.
I'm starting to play with it. I was a little afraid because it is part of my job, but now I find it a lot of fun.
Josh, I only have data for 3 representative months, and I cannot rule out a priori that the rate of change across the months differs between the 2 sexes.
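
One way to examine whether the change across months differs between the sexes is to compare a model with and without the sexo:mes interaction using a likelihood-ratio test. This is only a sketch on simulated data; the sample size, level labels and prevalence are invented, and the other predictors are left out to keep it short.

```r
set.seed(2)
n <- 300

# Simulated stand-in data with 2 sexes and 3 months; all values are made up
dat <- data.frame(
  sexo = factor(sample(c("F", "M"), n, replace = TRUE)),
  mes  = factor(sample(c("m1", "m2", "m3"), n, replace = TRUE))
)
dat$prevalencia <- rbinom(n, size = 1, prob = 0.3)

# Additive model: the month effect is the same for both sexes
m_add <- glm(prevalencia ~ sexo + mes, family = binomial, data = dat)

# Interaction model: the month effect may differ between the sexes
m_int <- glm(prevalencia ~ sexo * mes, family = binomial, data = dat)

# Likelihood-ratio (deviance) test of the sexo:mes interaction
lrt <- anova(m_add, m_int, test = "Chisq")
lrt
```

With 2 sexes and 3 months the interaction uses 2 degrees of freedom, which is far more modest than a 4-way interaction.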

Thanks again
Mario Garrido Escudero
PhD student
Dpto. de Biología Animal, Ecología, Parasitología, Edafología y Qca. Agrícola
Universidad de Salamanca

Re: Don't know what test I have to use

Bert Gunter
In reply to this post by Joshua Wiley-2
... But I would think that month should be treated as a cyclical
quantity, not as a factor with 12 independent levels, e.g. by
transforming month to sin(2*pi*monthNumber/12). This assumes 1-year
periodicity, which might not be right, of course. Time series
methods could obviously be relevant here. Given the possible
importance of such periodicity and the relative complexity of the
methodology necessary to deal with it properly, you might benefit by
consulting your local statistician for help.
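
A rough sketch of the cyclical coding on simulated data covering all 12 months. Two things here are assumptions beyond the post: a cosine term is added alongside the sine so that the peak month does not have to be fixed in advance, and the simulated seasonal pattern is invented.

```r
set.seed(3)
n <- 600
monthNumber <- sample(1:12, n, replace = TRUE)

# Simulated prevalence that peaks around month 7; the numbers are made up
p <- plogis(-1 + 0.8 * cos(2 * pi * (monthNumber - 7) / 12))
dat <- data.frame(monthNumber = monthNumber,
                  prevalencia = rbinom(n, size = 1, prob = p))

# Month as a cyclical quantity with a 1-year period
fit_cyc <- glm(prevalencia ~ sin(2 * pi * monthNumber / 12) +
                             cos(2 * pi * monthNumber / 12),
               family = binomial, data = dat)

# Month as a factor with 12 independent levels, for comparison
fit_fac <- glm(prevalencia ~ factor(monthNumber),
               family = binomial, data = dat)

length(coef(fit_cyc))  # 3 coefficients
length(coef(fit_fac))  # 12 coefficients

# The cyclical fit wraps around: month 0 and month 12 get the same prediction
wrap <- predict(fit_cyc, newdata = data.frame(monthNumber = c(0, 12)),
                type = "response")
wrap
```

The cyclical version spends 3 coefficients on month instead of 12, at the price of assuming a smooth one-year cycle.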

-- Bert

--
Bert Gunter
Genentech Nonclinical Biostatistics

Re: Don't know what test I have to use

gaiarrido
Thanks for the advice; I will use it in the coming years. Until now I have only got data for 3 independent months, and one of those "months" is actually the whole summer pooled together because of the small sample size, so I suppose I can't use it in the way you suggest.
Mario Garrido Escudero
PhD student
Dpto. de Biología Animal, Ecología, Parasitología, Edafología y Qca. Agrícola
Universidad de Salamanca