Searching and navigating a Web site is a tedious task, and hierarchical models such as site maps are frequently used to organize the site's content. In this work,...
With the explosive growth in the number of pages on the WWW, covering more useful information with limited storage and computation resources has become increasingly important in web IR research. Using web p...
We describe a method to extract tabular data from web pages. Rather than just analyzing the DOM tree, we also exploit visual cues in the rendered version of the document to extrac...
Based on the type of collaborative objects, a collaborative filtering (CF) system falls into one of two categories: item-based CF and user-based CF. Clustering is the basic idea i...
Organizing large document collections for finding information easily and quickly has always been an important user requirement. This paper describes a flexible and powerful dynami...
A software design is often modeled as a collection of Unified Modeling Language (UML) diagrams. There are different aspects of the software system that are covered by many differe...
Focused crawlers are considered a promising way to tackle the scalability problem of topic-oriented or personalized search engines. To design a focused crawler, the choice of s...
Today, the choice of a particular programming language limits the alternative products that can be used to deploy the program. The purpose of this work is to break the strong tie...
A profiling adversary is an adversary whose goal is to classify a population of users into categories according to messages they exchange. This adversary models the most common pr...
Aleksandra Korolova, Ayman Farahat, Philippe Golle