Broder et al.’s [3] shingling algorithm and Charikar’s [4] random projection based approach are considered “state-of-theart” algorithms for finding near-duplicate web pag...
Query substitution is an important problem in information retrieval. Much work focuses on how to find substitutes for any given query. In this paper, we study how to efficiently ...
Most queries in web search are ambiguous and multifaceted. Identifying the major senses and facets of queries from search log data, referred to as query subtopic mining in this pa...
Yunhua Hu, Ya-nan Qian, Hang Li, Daxin Jiang, Jian...
Monte-Carlo evaluation consists in estimating a position by averaging the outcome of several random continuations, and can serve as an evaluation function at the leaves of a min-ma...
Abstract. We propose in this paper to use NLP approaches to validate induced syntactic relations. We focus on a Web Validation system, a Semantic Vector-based approach, and finally...