Abstract. The requirements for effective search and management of the WWW are stronger than ever. Currently Web documents are classified based on their content not taking into acco...
Maria Halkidi, Benjamin Nguyen, Iraklis Varlamis, ...
We propose an agent for exploring and categorizing documents on the World Wide Web based on a user pro le. The heart of the agent is an automatic categorization of a set of docume...
Eui-Hong Han, Daniel Boley, Maria L. Gini, Robert ...
This paper considers the problem of identifying on the Web compound documents (cDocs) ? groups of web pages that in aggregate constitute semantically coherent information entities...
The World Wide Web is a vast source of information accessible to computers, but understandable only to humans. The goal of the research described here is to automatically create a...
Mark Craven, Dan DiPasquo, Dayne Freitag, Andrew M...
This paper presents a novel information system integrating advanced information extraction technology and automatic hyper-linking. Extracted entities are mapped into a domain onto...
Stephan Busemann, Witold Drozdzynski, Hans-Ulrich ...