In this paper we propose a hierarchical clustering engine, called SnakeT, that is able to organize on-the-fly the search results drawn from 16 commodity search engines into a hier...
Web pages contain clutter (such as ads, unnecessary images and extraneous links) around the body of an article, which distracts a user from actual content. Extraction of "use...
Centralized Resource Description Framework (RDF) repositories have limitations both in their failure tolerance and in their scalability. Existing Peer-to-Peer (P2P) RDF repositori...
A problem facing many textbook authors (including one of the authors of this paper) is the inevitable delay between new advances in the subject area and their incorporation in a n...
Timothy Miles-Board, Christopher Bailey, Wendy Hal...
We propose two new tools to address the evolution of hyperlinked corpora. First, we define time graphs to extend the traditional notion of an evolving directed graph, capturing li...
Ravi Kumar, Jasmine Novak, Prabhakar Raghavan, And...