Given a huge online social network, how do we retrieve information from it through crawling? Even better, how do we improve the crawling performance by using parallel crawlers tha...
Duen Horng Chau, Shashank Pandit, Samuel Wang, Chr...
Combining data and code from third-party sources has enabled a new wave of web mashups that add creativity and functionality to web applications. However, browsers are poorly desi...
Personalized web search is a promising way to improve search quality by customizing search results for people with individual information goals. However, users are uncomfortable w...
Determining the user intent of Web searches is a difficult problem due to the sparse data available concerning the searcher. In this paper, we examine a method to determine the us...
Bernard J. Jansen, Danielle L. Booth, Amanda Spink
[Extended Abstract] Ken Wakita, Tokyo Institute of Technology, 2-12-1 Ookayama, Meguro-ku, Tokyo 152-8552, Japan, wakita@is.titech.ac.jp; Toshiyuki Tsurumi, Tokyo Institute of Technolog...
We present an incremental algorithm for building a neighborhood graph from a set of documents. This algorithm is based on a population of artificial agents that imitate the way re...
Semantic similarity measures play important roles in information retrieval and Natural Language Processing. Previous work in semantic web-related applications such as community mi...
Finding relationships between entities on the Web, e.g., the connections between different places or the commonalities of people, is a novel and challenging problem. Existing Web ...
The debate within the Web community over the optimal means by which to organize information often pits formalized classifications against distributed collaborative tagging systems...
In this paper, we describe a capture-recapture experiment conducted on Google's and MSN's cached directories. The anticipated outcome of this work was to monitor evoluti...