parameters, the weight that a triplet is assigned must equal the reciprocal of the probability of that triplet occurring. The triplet will appear with probability $w_{Ai}^{-1}$ from file A and with probability $w_{Bj}^{-1}$ from file B. Thus, the weight that this record should get is the inverse of its probability of occurring, which is $1/(w_{Ai}^{-1} + w_{Bj}^{-1})$. Using these weights assures that every estimate of the form
will be an unbiased estimate. The weights $w_{ABj}$ do not necessarily add to n. Weights that add to n may seem desirable, and in that case we define
The most important feature of Rubin’s approach is multiple imputation. Multiple imputation is used to assess the variability of the inference or estimation with respect to the imputation process. The variability can be thought of as having two sources: variability due to the choice of imputation model, and variability due to imputation given the imputation model.
Variability due to imputation is addressed by taking as potential imputations the k data points whose values are nearest to the fitted value, rather than only the single closest point. Then, to create a number of imputed files, one randomly chooses one of the k to match to each record. The variability due to imputation is then measured by analyzing each concatenated file in turn and comparing the results.
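The procedure just described can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Rubin's implementation: there is a single matching variable X, Z is observed only in file B, the regression is ordinary least squares, and "nearest" is taken to mean the smallest absolute difference between a file A record's fitted value and a file B record's observed Z. The function names `fit_line`, `knn_imputations`, and `between_imputation_variance` are hypothetical.

```python
import random
import statistics


def fit_line(x, y):
    """Least-squares intercept and slope for y = a + b * x (one predictor)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def knn_imputations(x_a, x_b, z_b, k=3, m=5, seed=0):
    """Build m imputed Z vectors for file A.

    For each record in file A, the k file B records whose observed Z is
    nearest to the fitted value are kept as candidate donors (an assumed
    distance rule); each imputed file draws one donor at random per record.
    """
    rng = random.Random(seed)
    intercept, slope = fit_line(x_b, z_b)  # regress Z on X within file B
    donor_sets = []
    for xi in x_a:
        fitted = intercept + slope * xi
        ranked = sorted(range(len(z_b)), key=lambda j: abs(z_b[j] - fitted))
        donor_sets.append(ranked[:k])
    return [[z_b[rng.choice(donors)] for donors in donor_sets]
            for _ in range(m)]


def between_imputation_variance(imputed_files, estimator=statistics.mean):
    """Spread of one estimate across the imputed files (needs m >= 2)."""
    estimates = [estimator(f) for f in imputed_files]
    return statistics.variance(estimates)
```

With k = 1 every imputed file is identical and the between-imputation variance is zero, which is why the procedure keeps k candidates rather than only the closest one.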
Variability with respect to the imputation model used, here discussed as some sort of regression model, can also be weakly assessed through a type of sensitivity analysis. An essential example of this is the assumption that the partial correlation between Y and Z given X is equal to 0. One could begin by performing several imputations with the assumption that $\rho_{YZ \cdot X}$ equals 0. In addition, one could assume that $\rho_{YZ \cdot X}$ is equal to, say, 0.5. Then, rather than regress Y on X and Z on X to determine the nearest-to-the-fitted values, one could regress Y on X and Z, and Z on X and Y, since now the entire covariance matrix of Y, X, Z is specified. Then several imputations could again be performed with this new assumption. The variance due to model selection could then be assessed by comparing the results to those obtained when the assumption that $\rho_{YZ \cdot X}$ equals 0 is made.
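To see why fixing the partial correlation specifies the entire covariance matrix, consider the simplest case of a scalar X and standardized variables (both are assumptions made here for illustration). The correlations of Y with X and of Z with X are estimable from files A and B respectively, and the standard partial-correlation identity then gives the unobserved correlation between Y and Z. The sketch below, with hypothetical function names, computes that implied correlation and the resulting standardized coefficients for the regression of Y on X and Z.

```python
import math


def implied_corr_yz(r_yx, r_zx, partial):
    """Correlation of Y and Z implied by an assumed partial correlation
    given X, via rho_YZ = rho_YX * rho_ZX
    + rho_YZ.X * sqrt((1 - rho_YX^2) * (1 - rho_ZX^2))."""
    return r_yx * r_zx + partial * math.sqrt((1 - r_yx ** 2) * (1 - r_zx ** 2))


def std_coefs_y_on_x_z(r_yx, r_zx, partial):
    """Standardized coefficients (beta_X, beta_Z) for regressing Y on X and Z,
    using the correlation matrix completed under the assumed partial."""
    r_yz = implied_corr_yz(r_yx, r_zx, partial)
    denom = 1 - r_zx ** 2
    beta_x = (r_yx - r_yz * r_zx) / denom
    beta_z = (r_yz - r_yx * r_zx) / denom
    return beta_x, beta_z
```

Under the assumption that the partial correlation is 0, the coefficient on Z vanishes and the regression of Y on X and Z reduces to the regression of Y on X alone, so the two imputation models coincide in that case. A nonzero assumed value, such as 0.5, gives Z a nonzero coefficient and therefore different fitted values and potentially different matches.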
There is a very close relative to Rubin’s procedure that has the advantage of some computational simplicity. This procedure could be used to shed some light on the sensitivity of the analysis to the failure of the conditional independence assumption. The discussion focuses on the case of unconstrained statistical