Given a large volume of Web documents, we consider problem of finding the shortest keyword sequences for each of the documents such that a keyword sequence can be rendered to a g...
The World-Wide Web consists not only of informational, but also computational resources. However, these resources, especially computational ones are underutilized. One characteris...
Web search engines like Google have made us all smarter by providing ready access to the world's knowledge whenever we need to look up a fact, learn about a topic or evaluate...
Among the vast numbers of images on the web are many duplicates and near-duplicates, that is, variants derived from the same original image. Such near-duplicates appear in many we...
Jun Jie Foo, Justin Zobel, Ranjan Sinha, Seyed M. ...
The World-Wide Web consists of a huge number of unstructured documents, but it also contains structured data in the form of HTML tables. We extracted 14.1 billion HTML tables from...
Michael J. Cafarella, Alon Y. Halevy, Daisy Zhe Wa...