For many companies and institutions, it is no longer sufficient to have a web site and high-quality products or services. In many cases, what makes the difference between success and failure in e-business is the potential of the respective web site to attract and retain visitors. This potential is determined by the site's content, its design, and technical aspects such as page load time. In this paper, we concentrate on the content represented by the free text of each web page. We propose a method to determine the set of the most important words in a web site from the visitors' point of view. This is done by combining usage information with web page content, arriving at a set of keywords determined implicitly by the site's visitors. By applying self-organizing neural networks to the respective usage and content data, we can identify clusters of typical visitors and the most important pages and words for each cluster. We applied our method to a bank...
Juan D. Velásquez, Richard Weber, Hiroshi Y