A large proportion of the massive amounts of data collected by many organisations today is about people, and often contains identifying information such as names, addresses, dates of birth, or social security numbers. Privacy and confidentiality are therefore of great concern when such data is processed and analysed, and when it needs to be shared between organisations or made publicly available. The research area of data linkage suffers in particular from a lack of publicly available real-world data sets, because experimental evaluations and comparisons are difficult to conduct without real data. To overcome this problem, we have developed a data generator that allows flexible creation of synthetic data with realistic characteristics, such as frequency distributions and error probabilities. Our data generator significantly improves on similar earlier approaches, and allows the creation of data containing records for individuals, households and families.
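As a rough illustration of the two characteristics named above, the following minimal sketch draws attribute values according to a frequency table and then corrupts a duplicate of each value with a given error probability. It is not the generator described here: the table `SURNAME_FREQ`, its values, and the functions `sample_values`, `introduce_error` and `generate` are all hypothetical names chosen for this example, and a single random character edit stands in for whatever error model a real generator would use.

```python
import random

ALPHABET = "abcdefghijklmnopqrstuvwxyz"

# Hypothetical frequency table (illustrative counts only, not real statistics).
SURNAME_FREQ = {"smith": 50, "jones": 30, "nguyen": 15, "zelinski": 5}


def sample_values(freq_table, n, rng):
    """Draw n values so that frequent values are drawn proportionally more often."""
    values = list(freq_table)
    weights = [freq_table[v] for v in values]
    return rng.choices(values, weights=weights, k=n)


def introduce_error(value, error_prob, rng):
    """With probability error_prob, apply one random single-character edit."""
    if not value or rng.random() >= error_prob:
        return value
    pos = rng.randrange(len(value))
    op = rng.choice(["delete", "substitute", "insert"])
    if op == "delete":
        return value[:pos] + value[pos + 1:]
    if op == "substitute":
        # Exclude the current character so the value always changes.
        letter = rng.choice([c for c in ALPHABET if c != value[pos]])
        return value[:pos] + letter + value[pos + 1:]
    return value[:pos] + rng.choice(ALPHABET) + value[pos:]


def generate(n, error_prob, seed=0):
    """Return n (original, possibly corrupted duplicate) pairs."""
    rng = random.Random(seed)  # seeded so that runs are reproducible
    originals = sample_values(SURNAME_FREQ, n, rng)
    return [(v, introduce_error(v, error_prob, rng)) for v in originals]
```

Calling `generate(1000, 0.1)` under these assumptions would yield 1000 pairs in which common surnames dominate and roughly one duplicate in ten differs from its original by a single character edit.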