Sciweavers

3251 search results - page 321 / 651
» Challenges in Web Information Retrieval
Sort
View
CIKM
2005
Springer
15 years 10 months ago
Focused crawling for both topical relevance and quality of medical information
Subject-specific search facilities on health sites are usually built using manual inclusion and exclusion rules. These can be expensive to maintain and often provide incomplete c...
Thanh Tin Tang, David Hawking, Nick Craswell, Kath...
WWW
2008
ACM
16 years 5 months ago
Low-load server crawler: design and evaluation
This paper proposes a method of crawling Web servers connected to the Internet without imposing a high processing load. We are using the crawler for a field survey of the digital ...
Katsuko T. Nakahira, Tetsuya Hoshino, Yoshiki Mika...
140
Voted
DIS
2007
Springer
15 years 11 months ago
Unsupervised Spam Detection Based on String Alienness Measures
We propose an unsupervised method for detecting spam documents from Web page data, based on equivalence relations on strings. We propose 3 measures for quantifying the alienness (...
Kazuyuki Narisawa, Hideo Bannai, Kohei Hatano, Mas...
ITNG
2006
IEEE
15 years 10 months ago
A Study of Self-Organizing Map in Interactive Relevance Feedback
With the vast amount of potential relevant documents on the Web, a key question for a retrieval system is how to achieve a high accuracy retrieval under current Web setting. The w...
Daqing He
ICAPR
2005
Springer
15 years 10 months ago
Combining Text and Link Analysis for Focused Crawling
The number of vertical search engines and portals has rapidly increased over the last years, making the importance of a topic-driven (focused) crawler evident. In this paper, we de...
George Almpanidis, Constantine Kotropoulos