The Web is constantly changing, but most tools used to access Web content deal only with what can be captured at a single instant in time. As a result, Web users may not have a g...
To find near-duplicate documents, fingerprint-based paradigms such as Broder's shingling and Charikar's simhash algorithms have been recognized as effective approaches a...
Agents can personalize otherwise impersonal computational systems. The World Wide Web presents the same appearance to every user regardless of that user’s past activity. Web Bro...
We deal with computational assumptions needed in order to design secure cryptographic schemes. We suggest a classification of such assumptions based on the complexity of falsifying...
The World-Wide Web consists of a huge number of unstructured documents, but it also contains structured data in the form of HTML tables. We extracted 14.1 billion HTML tables from...
Michael J. Cafarella, Alon Y. Halevy, Daisy Zhe Wa...