Sciweavers

» How Reliable Are the Results of Large-Scale Information Retr...
DIS
2007
Springer
14 years 3 months ago
Unsupervised Spam Detection Based on String Alienness Measures
We propose an unsupervised method for detecting spam documents from Web page data, based on equivalence relations on strings. We propose 3 measures for quantifying the alienness (...
Kazuyuki Narisawa, Hideo Bannai, Kohei Hatano, Mas...
WWW
2006
ACM
14 years 9 months ago
Interactive wrapper generation with minimal user effort
While much of the data on the web is unstructured in nature, there is also a significant amount of embedded structured data, such as product information on e-commerce sites or sto...
Utku Irmak, Torsten Suel
WWW
2007
ACM
14 years 9 months ago
Explorations in the use of semantic web technologies for product information management
Master data refers to core business entities a company uses repeatedly across many business processes and systems (such as lists or hierarchies of customers, suppliers, accounts, ...
Chen Wang, Daniel C. Wolfson, Jean-Sébastie...
SIGIR
2004
ACM
14 years 2 months ago
Block-based web search
Multiple topics and varying lengths of web pages are two negative factors significantly affecting the performance of web search. In this paper, we explore the use of page segmentati...
Deng Cai, Shipeng Yu, Ji-Rong Wen, Wei-Ying Ma
SIGIR
2012
ACM
11 years 11 months ago
Time-sensitive query auto-completion
Query auto-completion (QAC) is a common feature in modern search engines. High quality QAC candidates enhance search experience by saving users time that otherwise would be spent ...
Milad Shokouhi, Kira Radinsky