Record linkage is an important data integration task that has many practical uses for matching, merging and duplicate removal in large and diverse databases. However, a quadratic ...
Timothy de Vries, Hui Ke, Sanjay Chawla, Peter Chr...
Statistical topic models provide a general data-driven framework for automated discovery of high-level knowledge from large collections of text documents. While topic models can p...
Chaitanya Chemudugunta, Padhraic Smyth, Mark Steyv...
Current systems for managing workload on clusters of workstations, particularly those available for Linux-based (Beowulf) clusters, are typically based on traditional process-base...
Daniel Andresen, Nathan Schopf, Ethan Bowker, Timo...
An increasing number of social networking platforms are giving users the option to endorse entities that they find appealing, such as videos, photos, or even other users. We defin...
Web sites allow the collection of vast amounts of navigational data – clickstreams of user traversals through the site. These massive data stores offer the tantalizing possibil...
Kaushik Dutta, Debra E. VanderMeer, Anindya Datta,...