A web site is a semi-structured collection of different kinds of data whose purpose is to show relevant information to visitors and thereby capture their attention. Understanding the specific preferences that define visitor behavior in a web site is a complex task. One approximation is to assume that this behavior depends on the content, the navigation sequence, and the time spent on each page visited. These variables can be extracted from the web log files and from the web site itself, using web usage mining and web content mining, respectively. By combining the described variables, a similarity measure among visitor sessions is introduced and used in a clustering algorithm, which identifies groups of similar sessions and thus allows the analysis of visitor behavior. To demonstrate the methodology’s effectiveness, it was applied to a particular web site, showing the benefits of the described approach.
Juan D. Velásquez, Hiroshi Yasuda, Terumasa
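
To make the idea of a session similarity measure concrete, the following is a minimal sketch of how content, navigation sequence, and time spent might be combined into one score. It is an illustration under stated assumptions, not the formula used by the authors: the function names `cosine` and `session_similarity`, the representation of a session as an ordered list of `(page, seconds)` pairs, the toy page vectors, and the choice of a min/max time ratio weighted by cosine similarity are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length content vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def session_similarity(s1, s2, page_vectors):
    """Hypothetical similarity between two sessions.

    Each session is an ordered list of (page_id, seconds_spent) pairs.
    Pages are compared position by position, so the navigation sequence
    matters; each comparison is weighted by how close the times spent are.
    """
    n = min(len(s1), len(s2))
    if n == 0:
        return 0.0
    total = 0.0
    # zip stops at the shorter session, i.e. after n positions.
    for (p1, t1), (p2, t2) in zip(s1, s2):
        longest = max(t1, t2)
        time_ratio = min(t1, t2) / longest if longest > 0 else 0.0
        total += time_ratio * cosine(page_vectors[p1], page_vectors[p2])
    return total / n


# Toy content vectors (assumed; in practice derived by web content mining).
page_vectors = {
    "home": [1.0, 0.0, 0.0],
    "products": [0.0, 1.0, 0.0],
    "contact": [0.0, 0.0, 1.0],
}

session_a = [("home", 10), ("products", 30)]
session_b = [("home", 20), ("products", 30)]
session_c = [("contact", 10), ("home", 5)]

sim_ab = session_similarity(session_a, session_b, page_vectors)
sim_ac = session_similarity(session_a, session_c, page_vectors)
```

In this toy data, sessions `session_a` and `session_b` visit the same pages in the same order with partly different dwell times, so they score higher than `session_a` and `session_c`, which share no content at matching positions. A pairwise score of this kind could then serve as the input to a clustering algorithm over sessions.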