Geography is becoming increasingly important in web search. Search engines can often return better results to users by analyzing features such as user location or geographic terms...
Qingqing Gan, Josh Attenberg, Alexander Markowetz,...
The great success of Web 2.0 is mainly fuelled by an infrastructure that allows web users to create, share, tag, and connect content and knowledge easily. The tools for developing...
This paper shares our experience in designing a web crawler that can download billions of pages using a single-server implementation and models its performance. We show that with ...
A major difference between corporate intranets and the Internet is that in intranets the barrier for users to create web pages is much higher. This limits the amount and quality o...
Pavel A. Dmitriev, Nadav Eiron, Marcus Fontoura, E...
Background: Document classification is a widespread problem with many applications, from organizing search engine snippets to spam filtering. We previously described Textpresso, ...