This paper presents experiments on classifying web pages by genre. Firstly, a corpus of 1539 manually labeled web pages was prepared. Secondly, 502 genre features were selected ba...
Abstract. One of the effects of the general Internet growth is an immense number of user accesses to WWW resources. These accesses are recorded in the web server log files, which...
This paper is a July 1999 snapshot of a "whitepaper" that I've been working on. The purpose of the whitepaper, which I initially drafted in April 1999, was to formu...
There has been an increased demand for characterizing user access patterns using web mining techniques since the informative knowledge extracted from web server log files can not ...
Text mining appliesthe sameanalytical functions of datamining to the domainof textual information, relying on sophisticatedtext analysis techniques that distill information from f...