This paper describes a new research proposal for multi-document summarization of dynamic content in web pages. Much information is lost on the Web due to the temporal character of w...
This paper focuses on the development of a maintainable information filtering system. A simple and efficient solution to this problem is to block Web sites by URL, including...
Nowadays, people search for information primarily on the WWW. From a user's perspective, readability is an important criterion for measuring the accessibility and thereby the quality ...
This paper is concerned with automatic extraction of titles from the bodies of HTML documents. Titles of HTML documents should be correctly defined in the title fields; however, i...
Comprehensive coverage of the public web is crucial to web search engines. Search engines use crawlers to retrieve pages and then discover new ones by extracting the pages' o...