Sciweavers

PVLDB
2008
13 years 7 months ago
Industry-scale duplicate detection
Duplicate detection is the process of identifying multiple representations of the same real-world object in a data source. Duplicate detection is a problem of critical importance in...
Melanie Weis, Felix Naumann, Ulrich Jehle, Jens Lu...
SIGIR
2004
ACM
14 years 1 months ago
Focused named entity recognition using machine learning
In this paper we study the problem of finding the most topical named entities among all entities in a document, which we refer to as focused named entity recognition. We show that th...
Li Zhang, Yue Pan, Tong Zhang
ECIR
2004
Springer
13 years 9 months ago
Identification of Relevant and Novel Sentences Using Reference Corpus
In the novelty task on sentence level, the amount of information used in similarity computation is the major challenging issue. A shallow NLP approach extracts noun and verb featu...
Hsin-Hsi Chen, Ming-Feng Tsai, Ming-Hung Hsu
SIGIR
2006
ACM
14 years 1 months ago
Thread detection in dynamic text message streams
Text message stream is a newly emerging type of Web data which is produced in enormous quantities with the popularity of Instant Messaging and Internet Relay Chat. It is benefici...
Dou Shen, Qiang Yang, Jian-Tao Sun, Zheng Chen
SIGIR
2009
ACM
14 years 2 months ago
Named entity recognition in query
This paper addresses the problem of Named Entity Recognition in Query (NERQ), which involves detection of the named entity in a given query and classification of the named entity...
Jiafeng Guo, Gu Xu, Xueqi Cheng, Hang Li