Traditionally, information extraction from web tables has focused on small, more or less homogeneous corpora, often based on assumptions about the use of <table> tags. A mul...
Previous work on spatio-temporal analysis of news items and other documents has largely focused on broad categorization of small text collections by region or country. A system fo...
A (directed) network of people connected by ratings or trust scores, and a model for propagating those trust scores, is a fundamental building block in many of today's most s...
Ramanathan V. Guha, Ravi Kumar, Prabhakar Raghavan...
Automatically generated HTML, as produced by WYSIWYG programs, typically contains much repetitive and unnecessary markup. This paper identifies aspects of such HTML that may be al...
Peer-to-peer systems rely on scalable overlay networks that enable efficient routing between their members. Hypercubic topologies facilitate such operations while each node only nee...