With the rapid advance of the Internet, a large amount of sensitive data is collected, stored, and processed by different parties. Data mining is a powerful tool for extracting knowledge from large amounts of data. In general, data mining requires that the data be collected at a central site. However, privacy concerns may prevent different parties from sharing their data with others. Cryptography provides powerful tools that enable data sharing while protecting data privacy. In this paper, we briefly survey four recently proposed cryptographic techniques for protecting data privacy in distributed settings. First, we describe a privacy-preserving technique for learning Bayesian networks from a dataset vertically partitioned between two parties. Then, we describe three privacy-preserving data mining techniques for a fully distributed setting, in which each customer holds a single data record of the database.
Rebecca N. Wright, Zhiqiang Yang, Sheng Zhong