In this paper, we study search bot traffic from search engine query logs at a large scale. Although bots that generate search traffic aggressively can be easily detected, a large ...
This work addresses the need for stateful dataflow programs that can rapidly sift through huge, evolving data sets. These data-intensive applications perform complex multi-step c...
Dionysios Logothetis, Christopher Olston, Benjamin...
We study the problem of maintaining sketches of recent elements of a data stream. Motivated by applications involving network data, we consider streams that are asynchronous, in w...
Recommendation systems suggest items based on user preferences. Collaborative filtering is a popular approach in which recommending is based on the rating history of the system. O...
An important problem in the area of homeland security is to identify abnormal or suspicious entities in large datasets. Although there are methods from data mining and social netwo...