A number of organizations publish microdata for purposes such as public health and demographic research. Although attributes that clearly identify individuals, such as Name and Social Security Number, are generally removed, these databases can sometimes be joined with other public databases on attributes such as Zipcode, Sex, and Birthdate to reidentify individuals who were supposed to remain anonymous. "Joining" attacks are made easier by the availability of other, complementary databases over the Internet. K-anonymization is a technique that prevents joining attacks by generalizing and/or suppressing portions of the released microdata so that no individual can be uniquely distinguished from a group of size k. In this paper, we provide a practical framework for implementing one type of k-anonymization, called full-domain generalization. We introduce a set of algorithms for producing minimal full-domain generalizations, and show that these algorithms perform up to an order o...
Kristen LeFevre, David J. DeWitt, Raghu Ramakrishn
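
To make the notions of k-anonymity and full-domain generalization in the abstract concrete, the following is a minimal Python sketch. It is not the algorithm introduced in the paper: the sample rows, the Disease attribute, the function names, and the two generalization hierarchies (masking trailing Zipcode digits, reducing Birthdate to a year) are all illustrative assumptions. The sketch only checks whether a table is k-anonymous with respect to a quasi-identifier and applies a single generalization level to every value of an attribute, which is what "full-domain" is taken to mean here.

```python
from collections import Counter


def is_k_anonymous(rows, quasi_identifier, k):
    """Return True if every combination of quasi-identifier values
    that appears in `rows` appears at least k times."""
    counts = Counter(tuple(row[attr] for attr in quasi_identifier) for row in rows)
    return all(count >= k for count in counts.values())


def generalize_zipcode(zipcode, level):
    """Hypothetical hierarchy: replace the last `level` digits with '*'."""
    if level == 0:
        return zipcode
    return zipcode[:-level] + "*" * level


def generalize_birthdate(birthdate, level):
    """Hypothetical hierarchy: 0 = full date, 1 = year only, 2 = suppressed."""
    if level == 0:
        return birthdate
    if level == 1:
        return birthdate[:4]
    return "*"


# Assumed mapping from attribute name to its generalization function.
HIERARCHIES = {"Zipcode": generalize_zipcode, "Birthdate": generalize_birthdate}


def full_domain_generalize(rows, levels):
    """Map ALL values of each listed attribute to the same hierarchy level."""
    generalized = []
    for row in rows:
        new_row = dict(row)
        for attr, level in levels.items():
            new_row[attr] = HIERARCHIES[attr](row[attr], level)
        generalized.append(new_row)
    return generalized


# Made-up microdata; Name and Social Security Number are already removed.
ROWS = [
    {"Zipcode": "53715", "Sex": "M", "Birthdate": "1976-01-21", "Disease": "Flu"},
    {"Zipcode": "53715", "Sex": "F", "Birthdate": "1986-04-13", "Disease": "Hepatitis"},
    {"Zipcode": "53703", "Sex": "M", "Birthdate": "1976-02-28", "Disease": "Bronchitis"},
    {"Zipcode": "53703", "Sex": "M", "Birthdate": "1976-01-21", "Disease": "Broken Arm"},
    {"Zipcode": "53706", "Sex": "F", "Birthdate": "1986-01-21", "Disease": "Sprained Ankle"},
    {"Zipcode": "53706", "Sex": "F", "Birthdate": "1986-02-28", "Disease": "Hang Nail"},
]
QUASI_IDENTIFIER = ["Zipcode", "Sex", "Birthdate"]

if __name__ == "__main__":
    print(is_k_anonymous(ROWS, QUASI_IDENTIFIER, 2))
    released = full_domain_generalize(ROWS, {"Zipcode": 2, "Birthdate": 1})
    print(is_k_anonymous(released, QUASI_IDENTIFIER, 2))
```

In this toy table every row is unique on (Zipcode, Sex, Birthdate), so a join on those attributes could single out individuals. Masking only one Zipcode digit still leaves a group of size one, while masking two digits and keeping only the birth year yields groups of three, so the released table satisfies 2-anonymity under these assumed hierarchies. A search for a minimal generalization would have to explore such level combinations systematically; this sketch tries them by hand.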