Using simulated data to develop and study diagnostic tools for data analysis is highly beneficial. Because the true model is known, the user can gain insight into what happens when assumptions are violated. However, care must be taken to ensure that the simulated data is a reasonable representation of what one would expect in the real world. This paper discusses the construction of simulated data sets and provides specific examples using linear and logistic regression analysis. It also addresses the execution of simulation-based data studies following data construction.

1 WHY USE SIMULATED DATA?

Researching analytical techniques through simulation provides benefits that are not possible from research based exclusively on theoretical models. In practice, assumptions are often violated when analyzing real data, where the true relationships in the data are unknown. Simulation allows a level of knowledge and control that leads to more robust and defensible solutions. Many ...
Christopher Michael Hill, Linda C. Malone