Search engine results are usually presented in some form of text summary (e.g., document title, some snippets of the page's content, a URL, etc). Based on the information con...
The first part of the paper provides a brief description of the Language Observatory Project (LOP) and highlights the major technical difficulties to be challenged. The latter par...
Yoshiki Mikami, Pavol Zavarsky, Mohd Zaidi Abd Roz...
We present the design of Dynabot, a guided Deep Web discovery system. Dynabot's modular architecture supports focused crawling of the Deep Web with an emphasis on matching, p...
Daniel Rocco, James Caverlee, Ling Liu, Terence Cr...
This paper presents automatic generation of the Web Ontology Language (OWL) from an UML model. The solution is based on an MDA-defined architecture for ontology development and th...
In this paper we analyze a very large junk e-mail corpus which was generated by a hundred thousand volunteer users of the Hotmail e-mail service. We describe how the corpus is bei...
Geoff Hulten, Joshua T. Goodman, Robert Rounthwait...