This paper is concerned with automatic extraction of titles from the bodies of HTML documents (web pages). Titles of HTML documents should be correctly defined in the title fields...
Traditional content-based e-mail spam filtering takes into account content of e-mail messages and apply machine learning techniques to infer patterns that discriminate spams from...
Abstract. We consider the problem of finding communities in large linked networks such as web structures or citation networks. We review similarity measures for linked objects and...
This paper proposes the integration of semantic information drawn from a web application’s domain knowledge into all phases of the web usage mining process (preprocessing, patte...
An ever-increasing amount of information on the Web today is available only through search interfaces: the users have to type in a set of keywords in a search form in order to acc...