Sciweavers

EDBT
2002
ACM

Cut-and-Pick Transactions for Proxy Log Mining

Web logs collected by proxy servers, referred to as proxy logs or proxy traces, contain information about Web document accesses by many users against many Web sites. This "many-to-many" characteristic poses a challenge to Web log mining techniques because individual access transactions are difficult to identify: in a proxy log, user transactions are not clearly bounded and are sometimes interleaved with each other as well as with noise. Most previous work has determined transaction boundaries with simplistic measures such as a fixed time interval, and has not addressed the problem of interleaved and noisy transactions. In this paper, we show that this simplistic view can lead to poor performance in building models to predict future access patterns. We present a more advanced cut-and-pick method for determining the access transactions from proxy logs, by deciding on more reasonable transaction boundaries and by removing noisy accesses...
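For illustration only, the simplistic baseline that the abstract criticizes can be sketched as below. This is not the paper's cut-and-pick method, whose details are not given in this listing. The sketch assumes one common reading of "fixed time interval": a new transaction starts whenever a client has been idle for longer than a fixed timeout. The record layout `(client, timestamp, url)`, the function name `split_transactions`, and the 1800-second default are all hypothetical choices made for the example.

```python
from collections import defaultdict


def split_transactions(records, max_gap=1800):
    """Group (client, timestamp, url) records into per-client transactions.

    Assumption: a new transaction starts whenever the gap between two
    consecutive requests from the same client exceeds max_gap seconds.
    """
    # Bucket requests by client so each client is segmented independently.
    by_client = defaultdict(list)
    for client, ts, url in records:
        by_client[client].append((ts, url))

    transactions = []
    for client, hits in by_client.items():
        hits.sort()  # order each client's requests by timestamp
        current = [hits[0][1]]
        for (prev_ts, _), (ts, url) in zip(hits, hits[1:]):
            if ts - prev_ts > max_gap:
                # Idle period exceeded the timeout: close the transaction.
                transactions.append((client, current))
                current = []
            current.append(url)
        transactions.append((client, current))
    return transactions


if __name__ == "__main__":
    log = [
        ("a", 0, "/x"),
        ("b", 10, "/p"),
        ("a", 100, "/y"),
        ("a", 5000, "/z"),
    ]
    for client, urls in split_transactions(log):
        print(client, urls)
```

A rule like this only looks at elapsed time per client, so it cannot separate two browsing tasks that overlap in time or discard unrelated requests, which is the interleaving and noise problem the abstract describes.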
Wenwu Lou, Guimei Liu, Hongjun Lu, Qiang Yang
Added 08 Dec 2009
Updated 08 Dec 2009
Type Conference
Year 2002
Where EDBT