Table of contents:
Analysis of Incomplete Multivariate Data......Page 1
Contents......Page 3
Preface......Page 9
1.1 Purpose......Page 11
1.2.1 The EM algorithm......Page 13
1.2.2 Markov chain Monte Carlo......Page 14
1.3 Why analysis by simulation?......Page 15
1.4.1 Scope of the rest of this book......Page 17
1.4.3 Software and computational details......Page 18
1.5 Bibliographic notes......Page 19
2.1 The complete-data model......Page 20
2.2.1 Missing at random......Page 21
2.2.2 Distinctness of parameters......Page 22
2.3.1 Observed-data likelihood......Page 23
2.3.2 Examples......Page 25
Definition......Page 30
A bivariate normal example......Page 31
2.4 Examining the ignorability assumption......Page 34
2.4.1 Examples where ignorability is known to hold......Page 35
2.4.2 Examples where ignorability is not known to hold......Page 37
2.5 General ignorable procedures......Page 38
2.5.1 A simulated example......Page 39
2.5.2 Departures from ignorability......Page 42
2.5.3 Notes on nonignorable alternatives......Page 44
2.6 The role of the complete-data model......Page 45
Complex sample surveys......Page 46
The role of imputation......Page 47
2.6.2 Inference treating certain variables as fixed......Page 48
Example: a comparison of two sample means......Page 51
3.2.1 Definition......Page 54
3.2.2 Examples......Page 59
3.2.3 EM for posterior modes......Page 66
3.2.4 Restrictions on the parameter space......Page 67
Example: testing hypothesis for an incomplete 2x2 table......Page 68
3.2.5 The ECM algorithm......Page 71
3.3.1 Stationary values......Page 72
Multiple modes......Page 73
Saddlepoints......Page 74
Likelihood ridges......Page 75
Boundary estimates......Page 76
General comments on the method of maximum likelihood......Page 77
3.3.2 Rate of convergence......Page 78
The missing information principle......Page 80
Missing information and convergence......Page 82
3.3.3 Example......Page 84
Monitoring and detecting convergence......Page 87
Asymptotic covariance matrices from EM......Page 88
Elementwise rates of convergence......Page 90
Accelerating convergence......Page 93
Convergence and prior information......Page 94
Convergence properties of ECM......Page 95
3.4 Markov chain Monte Carlo......Page 96
3.4.1 Gibbs sampling......Page 97
3.4.2 Data augmentation......Page 99
Application to missing-data problems......Page 101
3.4.3 Examples of data augmentation......Page 102
3.4.4 The Metropolis-Hastings algorithm......Page 110
3.4.5 Generalizations and hybrid algorithms......Page 111
3.5.1 The meaning of convergence......Page 112
Nonexistence of a stationary distribution......Page 113
Boundary values and absorbing states......Page 116
3.5.3 Rates of convergence......Page 117
Starting values and starting distributions......Page 120
Difficulties with slow convergence......Page 122
4.1 Introduction......Page 124
4.2.1 Dependent samples......Page 126
Posterior moments......Page 129
Posterior distributions and densities......Page 130
Quantiles......Page 131
Interval estimates......Page 132
Hypothesis tests......Page 133
Beyond scalar quantities......Page 134
4.2.3 Rao-Blackwellized estimates......Page 136
Example: the efficiency of Rao-Blackwellization......Page 138
4.3 Multiple imputation......Page 143
4.3.1 Bayesianly proper multiple imputations......Page 144
Proper multiple imputations and data augmentation......Page 145
Why only a few imputations are needed......Page 146
Complete-data estimators......Page 147
Rule for combining complete-data inferences......Page 149
Missing information......Page 150
Heuristic justification......Page 151
Further justification......Page 152
Combining point estimates and covariance matrices......Page 153
Combining p-values......Page 156
Combining likelihood-ratio test statistics......Page 157
4.4 Assessing convergence......Page 160
4.4.1 Monitoring convergence in a single chain......Page 161
Time-series plots and autocorrelation......Page 162
Variability of the sample autocorrelation......Page 165
Warnings about time series and autocorrelation......Page 168
4.4.2 Monitoring convergence with parallel chains......Page 169
Overdispersed starting values......Page 170
4.4.3 Choosing scalar functions of the parameter......Page 172
Worst linear function of the parameter......Page 174
Observed-data loglikelihood......Page 175
Methods based on a single chain......Page 176
Methods based on multiple chains......Page 177
Interval estimation for a scalar summary......Page 178
4.5.1 Choosing a method of inference......Page 180
4.5.2 Implementing a parameter-simulation experiment......Page 182
4.5.4 Choosing an imputation model......Page 185
When the analyst assumes more than the imputer......Page 187
When the imputer assumes more than the analyst......Page 188
4.5.5 Further comments on imputation modeling......Page 190
Analyses not based on full parametric models......Page 191
5.1 Introduction......Page 194
5.2.1 Basic notation......Page 195
5.2.2 Bayesian inference under a conjugate prior......Page 197
The inverted-Wishart distribution......Page 198
The normal inverted-Wishart prior and posterior......Page 199
Inferences about the covariance matrix......Page 201
A noninformative prior......Page 202
A ridge prior......Page 204
5.2.4 Alternative parameterizations and sweep......Page 206
The sweep operator......Page 208
5.3 The EM algorithm......Page 212
5.3.1 Preliminary manipulations......Page 213
5.3.2 The E-step......Page 214
Observed and missing parts of the sufficient statistics......Page 216
An implementation in pseudocode......Page 218
Starting values......Page 219
Estimates on the boundary......Page 220
Priors for incomplete data......Page 221
Modifications to the M-step......Page 223
5.3.5 Calculating the observed-data loglikelihood......Page 224
5.3.6 Example: serum-cholesterol levels of heart-attack patients......Page 226
5.3.7 Example: changes in heart rate due to marijuana use......Page 230
5.4.1 The I-step......Page 232
5.4.2 The P-step......Page 235
5.4.3 Example: cholesterol levels of heart-attack patients......Page 236
5.4.4 Example: changes in heart rate due to marijuana use......Page 241
6.1 Introduction......Page 246
6.2.2 Generating the imputations......Page 247
6.2.3 Complete-data point and variance estimates......Page 248
6.2.4 Combining the estimates......Page 250
6.2.5 Alternative choices for the number of imputations......Page 251
Advantages of multiple imputation over parameter simulation......Page 253
6.3.1 Predicting achievement in foreign language study......Page 255
6.3.2 Applying the normal model......Page 256
Inestimability of parameters......Page 258
6.3.4 Overcoming the problem of inestimability......Page 260
Inferences for logistic-regression coefficients......Page 262
Joint inferences for groups of coefficients......Page 264
6.4 A simulation study......Page 266
6.4.1 Simulation procedures......Page 267
Imputation......Page 269
Quantiles......Page 270
Correlation coefficients......Page 271
6.4.3 Results......Page 272
6.5.1 Monotone missingness patterns......Page 274
6.5.2 Computing alternative parameterizations......Page 277
Maximum-likelihood estimation......Page 280
Bayesian inference......Page 282
6.5.4 Monotone data augmentation......Page 286
Choosing the monotone pattern to be completed......Page 288
6.5.5 Implementation of the algorithm......Page 289
The I- and P-steps......Page 291
6.5.6 Uses and extensions......Page 296
6.5.7 Example......Page 297
7.1 Introduction......Page 301
7.2.1 The multinomial distribution......Page 302
Maximum-likelihood estimation......Page 305
7.2.2 Collapsing and partitioning the multinomial......Page 306
Factoring the likelihood......Page 310
7.2.3 The Dirichlet distribution......Page 311
Properties of the Dirichlet distribution......Page 313
Relationship to the gamma distribution......Page 314
7.2.4 Bayesian inference......Page 315
Noninformative priors......Page 317
Sparse tables and flattening priors......Page 319
Data-dependent priors......Page 320
7.2.6 Collapsing and partitioning the Dirichlet......Page 322
7.3.1 Characterizing an incomplete categorical dataset......Page 325
Observed-data likelihood......Page 327
The E- and M-steps......Page 329
Starting values and posterior modes......Page 332
Random zeroes and structural zeroes......Page 333
7.3.3 Data augmentation......Page 334
Imputation of unit-level missing data......Page 337
Analysis by parameter simulation......Page 338
Analysis by multiple imputation......Page 342
7.3.5 Example: Protective Services Project for Older Persons......Page 343
7.4.1 Factoring the likelihood and prior density......Page 348
Factoring the prior......Page 351
7.4.2 Monotone data augmentation......Page 353
Interleaving the I- and P-steps......Page 355
7.4.3 Example: driver injury and seatbelt use......Page 357
8.1 Introduction......Page 364
8.2.1 Definition......Page 365
Models for three categorical variables......Page 366
8.2.2 Eliminating associations......Page 368
Hierarchical models......Page 369
8.2.3 Sufficient statistics......Page 370
Correspondence to logit models......Page 372
8.3.1 Maximum-likelihood estimation......Page 374
8.3.2 Iterative proportional fitting......Page 376
Random and structural zeroes......Page 378
An implementation in pseudocode......Page 379
8.3.3 Hypothesis testing and goodness of fit......Page 380
8.3.4 Example: misclassification of seatbelt use and injury......Page 382
8.4.1 Prior distributions for loglinear models......Page 384
The constrained Dirichlet prior......Page 385
8.4.2 Inference using posterior modes......Page 386
Posterior modes and goodness of fit......Page 387
8.4.3 Inference by Bayesian IPF......Page 388
Relationship to conventional IPF......Page 390
An implementation in pseudocode......Page 391
8.4.4 Why Bayesian IPF works......Page 393
The Poisson/gamma representation......Page 394
The cell-means version of Bayesian IPF......Page 396
Heuristic argument for convergence......Page 397
Further notes......Page 400
8.4.5 Example: misclassification of seatbelt use and injury......Page 401
EM for loglinear models......Page 403
The ECM algorithm......Page 404
8.5.2 Goodness-of-fit statistics......Page 406
Adjusted goodness-of-fit statistics......Page 408
8.5.3 Data augmentation and Bayesian IPF......Page 409
8.6.1 Protective Services Project for Older Persons......Page 410
8.6.2 Driver injury and seatbelt use......Page 413
9.1 Introduction......Page 418
9.2.1 Definition......Page 419
9.2.2 Complete-data likelihood......Page 421
Maximum-likelihood estimates......Page 422
9.2.3 Example......Page 423
9.2.4 Complete-data Bayesian inference......Page 424
Inferences for mu and sigma under a noninformative prior......Page 425
Informative priors......Page 426
Loglinear models for the cell probabilities......Page 427
Linear models for the within-cell means......Page 428
Choosing the design matrix......Page 429
Example: Foreign Language Attitude Scale......Page 430
Bayesian inferences for beta and sigma under a noninformative prior......Page 432
Informative priors for beta and sigma......Page 434
Categorical variables completely missing......Page 435
The E-step......Page 439
Evaluating the observed-data loglikelihood......Page 441
The I-step......Page 442
The P-step......Page 443
An ECM algorithm......Page 444
Data augmentation-Bayesian IPF......Page 445
The unrestricted model......Page 448
Restricted models......Page 451
Risk and adverse psychological symptoms......Page 453
Risk and comprehension scores......Page 455
The imputation model......Page 456
Prior distributions......Page 457
Generating the imputations......Page 458
A proportional-odds model......Page 459
Partial correlation coefficients......Page 461
The imputation model......Page 463
A simulation study......Page 465
Further remarks......Page 467
10.2.1 Restricted covariance structures......Page 468
10.2.2 Heavy-tailed distributions......Page 469
10.2.4 Semicontinuous variables......Page 470
10.3 Random-effects models......Page 471
10.4 Models for complex survey data......Page 472
10.6 Mixture models and latent variables......Page 474
10.7 Coarsened data and outlier models......Page 475
10.8 Diagnostics......Page 476
Appendix A: Data Examples......Page 478
Appendix B: Storage of Categorical Data......Page 486
Appendix C: Software......Page 489
References......Page 490
Table of Contents......Page 514