In this paper, we propose a semi-supervised learning approach for classifying program (bot) generated web search traffic from that of genuine human users. The work is motivated by...
Hongwen Kang, Kuansan Wang, David Soukal, Fritz Be...
In this paper we discuss the management of semi-structured data, i.e., data that has irregular or dynamically changing structure. We describe components of the Stanford Tsimmis Pr...
Joachim Hammer, Jason McHugh, Hector Garcia-Molina
In this paper we present an evaluation of techniques that are designed to encourage web searchers to interact more with the results of a web search. Two specific techniques are ex...
Organizations like the Internet Archive have been capturing Web contents over decades, building up huge repositories of time-versioned pages. The timestamp annotations and the she...
Gerhard Weikum, Nikos Ntarmos, Marc Spaniol, Peter...
The ImpressionRank of a web page (or, more generally, of a web site) is the number of times users viewed the page while browsing search results. ImpressionRank captures the visibi...