Data generalization is widely used to protect identities and prevent inference of sensitive information during the public release of microdata. The k-anonymity model has been exte...
Many outlier detection methods do not merely decide whether a single data object is an outlier but also give an outlier score or “outlier factor” sig...
Leveraging information from relevance assessments has been proposed as an effective means for improving retrieval. We introduce a novel language modeling method which uses inform...
The popularity of Wikipedia and other online knowledge bases has recently sparked interest within the machine learning community in the problem of automatic linking. Automatic hy...
Histogram construction or sequence segmentation is a basic task with applications in database systems, information retrieval, and knowledge management. Its aim is to approximate a...
Tourist photographs constitute a large part of the images uploaded to photo-sharing platforms, but filtering methods are needed before one can extract useful knowledge from noisy ...
Adrian Popescu, Gregory Grefenstette, Pierre-Alain...
Web search engines are often presented with user queries that involve comparisons of real-world entities. Thus far, this interaction has typically been captured by users submittin...
Physical structures, for example indexes and materialized views, can improve query execution performance by orders of magnitude. Hence, it is important to choose the right configu...
Iman Elghandour, Ashraf Aboulnaga, Daniel C. Zilio...
The extensible markup language XML has become the de facto standard for information representation and interchange on the Internet. XML parsing is a core operation perfor...
BaseX is an early adopter of the upcoming XQuery Full Text Recommendation. This paper presents some of the enhancements made to the XML database to fully supp...