Commercial datasets are often large, relational, and dynamic. They contain many records of people, places, things, events and their interactions over time. Such datasets are rarel...
Andrew Fast, Lisa Friedland, Marc Maier, Brian Tay...
Abstract. In this paper we present static and dynamic studies of duplicate and near-duplicate documents in the Web. The static and dynamic studies involve the analysis of similar c...
Many users are familiar with the interesting but limited functionality of Data Detector interfaces like Microsoft's Smart Tags and Google's AutoLink. In this paper we si...
In situations where class labels are known for a part of the objects, a cluster analysis respecting this information, i.e. semi-supervised clustering, can give insight into the cl...
Wireless data broadcast has received a lot of attention from industries and academia in recent years. Access efficiency and energy conservation are two critical performance concer...