Integration of multiple heterogeneous data sources continues to be a critical problem for many application domains and a challenge for researchers world-wide. With the increasing ...
We present a framework for clustering distributed data in unsupervised and semi-supervised scenarios, taking into account privacy requirements and communication costs. Rather than...
Often several cooperating parties would like to have a global view of their joint data for various data mining objectives, but cannot reveal the contents of individual records due...
Software readability is a property that influences how easily a given piece of code can be read and understood. Since readability can affect maintainability, quality, etc., prog...
In this paper, we consider the problem of identifying and segmenting topically cohesive regions in the URL tree of a large website. Each page of the website is assumed to have a t...