The goal of entity resolution is to reconcile database references corresponding to the same real-world entities. Given the abundance of publicly available databases where entities...
Indrajit Bhattacharya, Lise Getoor, Louis Licamele
Finding patterns of social interaction within a population has wide-ranging applications, including disease modeling, cultural and information transmission, and behavioral ecology...
The top web search result is crucial for user satisfaction with the web search experience. We argue that the importance of the relevance at the top position necessitates special h...
Privacy-preserving data processing has become an important topic recently because of advances in hardware technology which have led to widespread proliferation of demographic and...
Spatial scan statistics are used to determine hotspots in spatial data, and are widely used in epidemiology and biosurveillance. In recent years, there has been much effort invest...
Deepak Agarwal, Andrew McGregor, Jeff M. Phillips,...
Several algorithms have been proposed to learn to rank entities modeled as feature vectors, based on relevance feedback. However, these algorithms do not model network connections...
Outlier detection can uncover malicious behavior in fields like intrusion detection and fraud analysis. Although there has been a significant amount of work in outlier detection, ...
This work introduces distance-based criteria for segmentation of object trajectories. Segmentation leads to simplification of the original objects into smaller, less complex primi...
Correlation clustering aims at grouping the data set into correlation clusters such that the objects in the same cluster exhibit a certain density and are all associated with a comm...
An effective approach to detect anomalous points in a data set is distance-based outlier detection. This paper describes a simple sampling algorithm to efficiently detect distance...