The aggregation and comparison of behavioral patterns on the WWW represent a tremendous opportunity for understanding past behaviors and predicting future behaviors. In this paper, we take a first step toward this goal. We present a large-scale study correlating the behaviors of Internet users on multiple systems ranging in size from 27 million queries to 14 million blog posts to 20,000 news articles. We formalize a model for events in these time-varying datasets and study their correlation. We have created an interface for analyzing the datasets, which includes a novel visual artifact, the DTWRadar, for summarizing differences between time series. Using our tool, we identify a number of behavioral properties that allow us to understand the predictive power of patterns of use.

Categories and Subject Descriptors
H.2.8 [Information Systems] Database Management - Data Mining, G.3 [Mathematics of Computing] Probability and Statistics - Time Series Analysis

General Terms
Algorithms, M...
Eytan Adar, Daniel S. Weld, Brian N. Bershad, Stev