The research reported in this paper is the first phase of a larger project on the automatic classification of Web pages by their genres. The long-term goal is the incorporation of...
In this paper we present an algorithm for automatic extraction of textual elements, namely titles and full text, associated with news stories in news web pages. We propose a super...
Intrusive Web advertising such as pop-ups and animated layer ads, which distract the user from reading or navigating through the main content of Web pages, is being perceived as a...
Hypertext classification deals with objects that have a more complex information structure than plain text. Present hypertext classification systems show the...
This paper explores the use of hierarchical structure for classifying a large, heterogeneous collection of web content. The hierarchical structure is initially used to train diffe...