Web spam pages use various techniques to achieve higher-than-deserved rankings in a search engine’s results. While human experts can identify spam, it is too expensive to manual...
Most of the current algorithms for finding related pages are exclusively based on text corpora of the WWW or incorporate only authority or hub values of pages. In this paper, we ...
Paul-Alexandru Chirita, Daniel Olmedilla, Wolfgang...
We investigate the criteria used by online searchers when assessing the relevance of web pages to information-seeking tasks. Twenty-four searchers were given three tasks each, and...
As the World Wide Web in China grows rapidly, mining knowledge from Chinese Web pages becomes increasingly important. Mining Web information usually relies on the machine learning ...
Many web sites contain large sets of pages generated using a common template or layout. For example, Amazon lays out the author, title, comments, etc. in the same way in all its b...