Indexing quality has an overwhelming effect on retrieval effectiveness of search engines. In the past few years it has become one of the major challenges in the search engines are...
Many applications which use web data extract information from a limited number of regions on a web page. As such, web page division into blocks and the subsequent block classifica...
Classification algorithms and document representation approaches are two key elements for a successful document classification system. In the past, much work has been conducted to...
This work aims to provide a page segmentation algorithm which uses both visual and content information to extract the semantic structure of a web page. The visual information is u...
A great challenge for web site designers is how to ensure users' easy access to important web pages efficiently. In this paper we present a clustering-based approach to addres...
Zhong Su, Qiang Yang, HongJiang Zhang, Xiaowei Xu,...