# Two-tailed exact binomial test with binom.test and sum(dbinom(...)) Classic List Threaded 4 messages Open this post in threaded view
|

## Two-tailed exact binomial test with binom.test and sum(dbinom(...))

 Hi R experts, I have a few related questions that are actually a combination of an R and a hopefully not too  trivial (?) statistics question, namely regarding the computation of an exact two-tailed binomial test. Let's assume the following scenario: - number of trials = 10 - p of success = 0.6 (a) Let's also assume we have an H1 that there are more than 6 successes and the number of successes we get is 8. In that case, we do sum(dbinom(8:10, 10, 0.6)) # 0.1672898 binom.test(8, 10, 0.6, alternative="greater") # 0.1673 (b) Now let's assume we have an H1 that there are fewer than 6 successes and the number of successes we get is 2. In that case, we do sum(dbinom(0:2, 10, 0.6)) # 0.01229455 binom.test(2, 10, 0.6, alternative="less") # 0.01229 So far no problem. My questions are now concerned with a two-tailed test: (1). My understanding would be that, if we have an H1 that says "the number of successes won't be 6", then we can add up the two probabilities from above: sum(dbinom(8:10, 10, 0.6)) + sum(dbinom(0:2, 10, 0.6)) # 0.1795843, or just sum(dbinom(c(0:2, 8:10), 10, 0.6)) # 0.1795843 However, that is not what binom.test(..., alternative="two.sided") does: binom.test(2, 10, 0.6, alternative="two.sided") # 0.01834, which is the method of small(er) p-values: sum(dbinom(0:10, 10, 0.6)[dbinom(0:10, 10, 0.6)<=dbinom(2, 10, 0.6)]) # 0.01834117 Thus, question 1) is, is there a reason binom.test is implemented the way it is rather than the other way? (2) I am struggling to understand two-tailed scenarios like this one: - number of trials = 235 - p of success = 1/6 - successes = 51 That is, cases where my logic of taking the successes+1 extreme cases on each tail don't work: adding the point probabilities of 51:235 is fine, but it of course makes no sense to add the point probabilities for 0:185 to that sum(dbinom(51:235, 235, 1/6)) # 0.02654425 sum(dbinom(0:185, 235, 1/6)) # 1 (!) So, while binom.test again does its small(er) p-value thing, ... binom.test(51, 235, 1/6, alternative="two.sided") # 0.04375 sum(dbinom(0:235, 235, 1/6)[dbinom(0:235, 235, 1/6)<=dbinom(51, 235, 1/6)]) # 0.04374797 ... I am wondering how my approach with adding the probabilities of the same number of events from each tail would be done here ...? (3) What is people's view on computing the two-tailed test like this, which leads to an ns result unlike binom.test? 2*sum(dbinom(51:235, 235, 1/6)) # 0.05308849 Any input would be much appreciated! R.Z. ______________________________________________ [hidden email] mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-helpPLEASE do read the posting guide http://www.R-project.org/posting-guide.htmland provide commented, minimal, self-contained, reproducible code.
Open this post in threaded view
|

## Re: Two-tailed exact binomial test with binom.test and sum(dbinom(...))

 If your null hypothesis is that the probability of a success is 0.6, i.e. H0: p=0.6, then those > (a) Let's also assume we have an H1 that there are more than 6 > successes > > (b) Now let's assume we have an H1 that there are fewer than 6 > successes > > (1). My understanding would be that, if we have an H1 that says "the > number of successes won't be 6" aren't appropriate alternative hypotheses, because you make a statement about the sample rather than the population. The correct H1 in the two-tailed case is         H1: the probability of success is not 0.6, i.e. p != 0.6 With the H1s you gave above, your implicit null hypothesis is         H0: there will be exactly 6 successes in a sample which you can refute with 100% certainty if you observe != 6 successes in any sample. Perhaps Unit 2 of the SIGIL course, which tries to explain the logic behind the binomial test in detail, might help you get a better understanding of the procedure.  Slides are freely available here:         http://www.stefan-evert.de/SIGIL/sigil_R/Alternatively, read any good introductory statistics textbook that includes the exact binomial test. > (3) What is people's view on computing the two-tailed test like this, > which leads to an ns result unlike binom.test? > 2*sum(dbinom(51:235, 235, 1/6)) # 0.05308849 This is a popular approximation (which I also use most of the time) because it's much less expensive (in computational terms) than computing an exact (likelihood-based) two-tailed p-value as binom.test() does.  This is particularly relevant if you want to compute confidence intervals for the true probability p based on a large sample, which takes ages with binom.test(). Hope this helps, Stefan ______________________________________________ [hidden email] mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-helpPLEASE do read the posting guide http://www.R-project.org/posting-guide.htmland provide commented, minimal, self-contained, reproducible code.