Random k-nearest-neighbour (RKNN) imputation is an established algorithm for filling in missing values in data sets. Assume that the data are missing at random (MAR), so that missingness is independent of the unobserved values, and that the probability of a response vector being complete has a positive lower bound. Then RKNN, with k equal to the square root of the sample size, asymptotically produces independent imputed values that follow the correct probability distribution of the missing ones. An experiment illustrates two different distance functions on a synthetic data set. © 2006 Elsevier B.V. All rights reserved.
Fredrik A. Dahl
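
The procedure summarised in the abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the function name `rknn_impute`, the restriction of donors to complete rows, the Euclidean distance over observed coordinates, and the rounding of k are all assumptions made for the example. Only the choice of k as the square root of the sample size comes from the abstract.

```python
import math

import numpy as np


def rknn_impute(data, rng=None):
    """Fill NaN entries by copying from a randomly chosen near neighbour.

    Hypothetical helper: donors are the complete rows, and the distance is
    Euclidean over the coordinates observed in the incomplete row.
    """
    rng = np.random.default_rng() if rng is None else rng
    data = np.asarray(data, dtype=float)
    missing = np.isnan(data)
    incomplete = missing.any(axis=1)
    donors = data[~incomplete]  # complete rows are the candidate donors
    if donors.shape[0] == 0:
        raise ValueError("no complete rows to draw donors from")

    # k = sqrt(sample size), as in the abstract; rounding and the cap at the
    # number of available donors are assumptions of this sketch.
    k = max(1, round(math.sqrt(data.shape[0])))
    k = min(k, donors.shape[0])

    result = data.copy()
    for i in np.flatnonzero(incomplete):
        observed = ~missing[i]
        # Assumed distance: Euclidean over the observed coordinates only.
        diff = donors[:, observed] - data[i, observed]
        dist = np.sqrt((diff ** 2).sum(axis=1))
        nearest = np.argsort(dist, kind="stable")[:k]
        # Pick one of the k nearest donors uniformly at random.
        donor = donors[rng.choice(nearest)]
        result[i, ~observed] = donor[~observed]
    return result
```

Because the donor is drawn at random from the k nearest rows rather than averaged over them, the imputed values keep the spread of the observed data. A different distance function could be substituted in the line that computes `dist`.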