We study the problem of anonymizing data with quasi-sensitive attributes. Quasi-sensitive attributes are not sensitive by themselves, but certain values or combinations of values may be linked to external knowledge to indirectly reveal sensitive information about an individual. We formalize the notions of l-diversity and t-closeness for quasi-sensitive attributes, which we call QS l-diversity and QS t-closeness, to prevent indirect sensitive attribute disclosure. We propose a two-phase anonymization algorithm that combines quasi-identifying value generalization and quasi-sensitive value suppression to achieve QS l-diversity and QS t-closeness.

Categories and Subject Descriptors: H.2.7 [Database Administration]: Security, integrity, and protection

General Terms: Algorithms, Design, Experimentation, Security
Pu Shi, Li Xiong, Benjamin C. M. Fung
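
To make the idea of indirect disclosure concrete, the following minimal sketch checks a distinct l-diversity condition over sensitive values inferred from a quasi-sensitive attribute. It is an illustration under stated assumptions, not the formal QS l-diversity definition or the two-phase algorithm of this paper. The function names, the `knowledge` mapping (standing in for external knowledge), the attribute names, and the toy records are all hypothetical. Treating suppressed or unlinked values as a single "unknown" label is also an assumption.

```python
from collections import defaultdict


def inferred_diversity(records, qi_keys, qs_key, knowledge):
    """Count distinct inferred sensitive labels per equivalence class.

    An equivalence class is the set of records sharing the same
    (already generalized) quasi-identifier values.
    """
    classes = defaultdict(set)
    for rec in records:
        group = tuple(rec[k] for k in qi_keys)
        # Assumption: a suppressed value (None) or a value with no known
        # link reveals nothing and is mapped to one "unknown" label.
        classes[group].add(knowledge.get(rec[qs_key], "unknown"))
    return {group: len(labels) for group, labels in classes.items()}


def satisfies_distinct_l_diversity(records, qi_keys, qs_key, knowledge, l):
    """True if every equivalence class has at least l distinct inferred labels."""
    counts = inferred_diversity(records, qi_keys, qs_key, knowledge)
    return all(n >= l for n in counts.values())


# Hypothetical external knowledge: a drug purchase hints at a disease.
knowledge = {"insulin": "diabetes", "metformin": "diabetes", "tamoxifen": "cancer"}

# Hypothetical records whose quasi-identifiers are already generalized.
records = [
    {"age": "30-39", "zip": "303**", "drug": "insulin"},
    {"age": "30-39", "zip": "303**", "drug": "metformin"},
    {"age": "40-49", "zip": "303**", "drug": "insulin"},
    {"age": "40-49", "zip": "303**", "drug": "tamoxifen"},
]

counts = inferred_diversity(records, ["age", "zip"], "drug", knowledge)
ok = satisfies_distinct_l_diversity(records, ["age", "zip"], "drug", knowledge, l=2)
```

In this toy data the first class contains two different drugs, yet both point to the same inferred disease, so the class has only one inferred label and the check with l = 2 fails. This is the kind of indirect disclosure that diversity on the raw quasi-sensitive values alone would miss.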