Hello all,

I am trying to impute some missing data using the mice package. The data
set I am working with contains 125 variables (190 observations),
involving both categorical and continuous data. Some of these variables
are missing up to 30% of their data.

I am running into a peculiar problem, which is illustrated by the
following example showing both the original data (blue) and the imputed
values (red):

http://home.simula.no/~harish/files/tmp/imputation-error.pdf

As the plot shows, mice seems to favour two or three distinct values in
each of the ten imputations. I would have expected the imputed values to
be more spread out. I observe this behaviour for each of the imputed
variables (~80 variables), at least the ones that I looked at.
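A minimal sketch of how the number of distinct imputed values could be checked. It uses the `nhanes` example data that ships with mice as a stand-in for the real data set, so the variable `bmi`, the seed, and the object names are illustrative assumptions and not taken from the analysis described above.

```r
library(mice)
library(lattice)  # generic for stripplot()

# nhanes ships with mice; it stands in for the real 125-variable data set.
imp <- mice(nhanes, m = 10, method = "pmm", seed = 1, printFlag = FALSE)

# imp$imp$bmi has one row per missing bmi value and one column per imputation.
imputed_bmi <- imp$imp$bmi

# Number of distinct imputed values within each of the ten imputations.
distinct_per_imputation <- sapply(imputed_bmi, function(x) length(unique(x)))
print(distinct_per_imputation)

# Observed and imputed values side by side, one strip per imputation.
stripplot(imp, bmi ~ .imp, pch = 20)
```

Since pmm draws each imputed value from the observed values of donor cases, the count per imputation can never exceed the number of distinct observed values; the question is whether it is much smaller than that.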

I have tried both constructing a predictor matrix (to specify the
predictors) and omitting it, letting mice figure out sensible defaults.
I have also tried upping the number of iterations per imputation, hoping
that would help the algorithm (pmm) converge to a different solution,
but that didn't change the imputations either.
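The two set-ups described above might look roughly like the sketch below. Again `nhanes` is only a stand-in for the real data, and the use of `quickpred()`, the `mincor` threshold, the seed, and the iteration counts are assumptions chosen for illustration.

```r
library(mice)

# Option 1: let mice choose its default predictor matrix.
imp_default <- mice(nhanes, m = 10, method = "pmm", maxit = 5,
                    seed = 1, printFlag = FALSE)

# Option 2: build a predictor matrix explicitly. quickpred() keeps a
# predictor when its absolute correlation with the target (or with the
# target's missingness indicator) is at least mincor.
pred <- quickpred(nhanes, mincor = 0.1)
imp_custom <- mice(nhanes, m = 10, method = "pmm", predictorMatrix = pred,
                   maxit = 20, seed = 1, printFlag = FALSE)

# Trace plots of the mean and SD of the imputed values per iteration.
plot(imp_custom)
```

If the trace lines of the ten imputations intermingle without any visible trend, adding further iterations is unlikely to change the imputed values.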

Could you please point me to where I should look to debug this
behaviour? I have been going through the recent mice manual [1], but is
there something in particular I should be looking at? I guess a bigger
question is, should I also be experimenting with other packages such as
Amelia and mi?

Thanks,

Harish

[1] http://www.stefvanbuuren.nl/publications/MICE%20in%20R%20-%20Draft.pdf
