I have some questions about the use of weights in a binomial glm, as I am not getting the results I would expect. In my case the weights can be seen as 'replicate weights': one respondent i in my dataset corresponds to w[i] persons in the population. From the documentation of glm, I understand that the weights can indeed be used for this: "For a binomial GLM prior weights are used to give the number of trials when the response is the proportion of successes." From "Modern applied statistics with S-Plus 3rd ed." I understand the same.

However, I am getting some strange results. I generated an example.

Generate some data which is similar to my dataset:

> Z <- rbinom(1000, 1, 0.1)
> W <- round(rnorm(1000, 100, 40))
> W[W < 1] <- 1

The probability of success can either be estimated using:

> sum(Z*W)/sum(W)
[1] 0.09642109

Or using glm:

> model <- glm(Z ~ 1, weights=W, family=binomial())
Warning message:
In glm.fit(x = X, y = Y, weights = weights, start = start, etastart = etastart,  :
  fitted probabilities numerically 0 or 1 occurred
> predict(model, type="response")
           1
2.220446e-16

These two results are obviously not the same. The strange thing is that when I scale the weights such that the total equals one, the probability is correctly estimated:

> model <- glm(Z ~ 1, weights=W/sum(W), family=binomial())
Warning message:
In eval(expr, envir, enclos) : non-integer #successes in a binomial glm!
> predict(model, type="response")
         1
0.09642109

However, scaling the weights should, as far as I am aware, not have an effect on the estimated parameters. I also tried some other scalings. For example, dividing the weights by 20 also gives me the correct result:

> model <- glm(Z ~ 1, weights=W/20, family=binomial())
Warning message:
In eval(expr, envir, enclos) : non-integer #successes in a binomial glm!
> predict(model, type="response")
         1
0.09642109

Am I misinterpreting the weights? Could this be a numerical problem?

Regards,
Jan
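The claim that rescaling the weights should not change the estimate can be checked without glm. The following is a minimal sketch, not part of the original post: it maximises the weighted Bernoulli log-likelihood directly with optimize(). The seed and the names negloglik, p_direct, p_raw and p_scaled are assumptions made for illustration, so the numbers will differ from those quoted above.

```r
# Sketch: weighted Bernoulli log-likelihood for an intercept-only model.
# The seed and all object names here are illustrative assumptions.
set.seed(42)
Z <- rbinom(1000, 1, 0.1)
W <- round(rnorm(1000, 100, 40))
W[W < 1] <- 1

# Negative weighted log-likelihood as a function of the success probability p
negloglik <- function(p, w) {
  -sum(w * (Z * log(p) + (1 - Z) * log(1 - p)))
}

# Closed-form weighted proportion
p_direct <- sum(Z * W) / sum(W)

# Numerical maximisers with raw and with rescaled weights
p_raw    <- optimize(negloglik, c(1e-6, 1 - 1e-6), w = W, tol = 1e-10)$minimum
p_scaled <- optimize(negloglik, c(1e-6, 1 - 1e-6), w = W / sum(W), tol = 1e-10)$minimum

print(c(direct = p_direct, raw = p_raw, scaled = p_scaled))
```

Multiplying all weights by a constant multiplies the log-likelihood by that constant, so the maximiser stays at sum(Z*W)/sum(W). This only checks the arithmetic of the weighted likelihood; it does not by itself explain the warning that glm gives with the unscaled weights.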
Jan,

It looks like you did not understand the line "For a binomial GLM prior weights are used to give the number of trials when the response is the proportion of successes." Weights must be a number of trials (hence an integer), not a proportion of a population.

Here is an example that clarifies the use of weights.

library(boot)
library(reshape)
# One row per trial: person A has 20 trials, person B has 10
dataset <- data.frame(Person = c(rep("A", 20), rep("B", 10)),
                      Success = c(rbinom(20, 1, 0.25), rbinom(10, 1, 0.75)))
# One row per person: proportion of successes (mean) and number of trials (length)
Aggregated <- cast(Person ~ ., data = dataset, value = "Success",
                   fun = list(mean, length))
# Fit on the individual trials
m0 <- glm(Success ~ 1, data = dataset, family = binomial)
# Fit on the proportions, weighted by the number of trials
m1 <- glm(mean ~ 1, data = Aggregated, family = binomial, weights = length)
# Both give the same estimated probability
inv.logit(coef(m0))
inv.logit(coef(m1))

Have a look at the survey package if you want to analyse stratified data.

Thierry

----------------------------------------------------------------------------
ir. Thierry Onkelinx
Research Institute for Nature and Forest (Instituut voor natuur- en bosonderzoek)
team Biometrics & Quality Assurance (team Biometrie & Kwaliteitszorg)
Gaverstraat 4
9500 Geraardsbergen
Belgium
tel. + 32 54/436 185
[hidden email]
www.inbo.be

To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of.
~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data.
~ Roger Brinner

The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data.
~ John Tukey

> -----Original message-----
> From: [hidden email] [mailto:[hidden email]] On behalf of Jan van der Laan
> Sent: Friday 16 April 2010 14:11
> To: [hidden email]
> Subject: [R] Weights in binomial glm
>
> I have some questions about the use of weights in binomial glm as I am
> not getting the results I would expect. In my case the weights I have
> can be seen as 'replicate weights'; one respondent i in my dataset
> corresponds to w[i] persons in the population. From the documentation
> of the glm method, I understand that the weights can indeed be used for
> this: "For a binomial GLM prior weights are used to give the number of
> trials when the response is the proportion of successes." From "Modern
> applied statistics with S-Plus 3rd ed." I understand the same.
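For readers without the boot and reshape packages, the same comparison from Thierry's example can be sketched in base R. This is a hedged variant and not his original code: tapply() stands in for cast(), plogis() stands in for inv.logit(), and the seed and the names prop, n, aggregated, p_trials and p_agg are assumptions added for illustration.

```r
# Sketch: per-trial fit versus aggregated fit with weights = number of trials.
# Base R only; the seed and object names are illustrative assumptions.
set.seed(1)
dataset <- data.frame(
  Person  = c(rep("A", 20), rep("B", 10)),
  Success = c(rbinom(20, 1, 0.25), rbinom(10, 1, 0.75))
)

# Aggregate to one row per person: proportion of successes and trial count
prop <- tapply(dataset$Success, dataset$Person, mean)
n    <- tapply(dataset$Success, dataset$Person, length)
aggregated <- data.frame(Person = names(prop),
                         prop   = as.vector(prop),
                         n      = as.vector(n))

# Fit on the individual 0/1 trials
m0 <- glm(Success ~ 1, data = dataset, family = binomial)
# Fit on the proportions, with the number of trials as prior weights
m1 <- glm(prop ~ 1, data = aggregated, family = binomial, weights = n)

# Back-transform the intercepts from the logit scale to probabilities
p_trials <- plogis(coef(m0))
p_agg    <- plogis(coef(m1))
print(c(trials = unname(p_trials), aggregated = unname(p_agg)))
```

Both intercepts should back-transform to the overall proportion of successes, mean(dataset$Success), because here the weights are genuine trial counts and prop * n is an integer number of successes.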
Re: Weights in binomial glm

