We have used a general purpose data mining tool to determine whether we can find any ‘golden nuggets’ in the web access logs of a large academic web site. Our goal was to use general purpose data mining algorithms to analyse visitors to the website and somehow characterise or distinguish them in some way. We used two web access logs, one from 2001 and one from 2003. We extracted 4 different feature sets from the web logs and used algorithms for classification (1R, J48/C4.5), clustering (EM), association finding (apriori) and feature selection (correlation based subset evaluation with best first search). We discovered several nuggets, the most significant being that a major difference between visitors from within Australia and visitors from outside Australia is that visitors from outside Australia generally arrive via search engines and are interested in information about postgraduate courses.
Victor Ciesielski, A. Lalani