Knowledge work in many fields requires examining several aspects of a collection of documents to attain meaningful understanding that is not explicitly available. Despite recent ad...
When a user is served with a ranked list of relevant documents by the standard document search engines, his search task is usually not over. He has to go through the entire docume...
A meta-search engine propagates user queries to its participant search engines following a server selection strategy. To facilitate server selection, the metasearch engine must ke...
Abstract. In order to organize huge document collections, labeled hierarchical structures are used frequently. Users are most efficient in navigating such hierarchies, if they refl...
Digital Libraries have gained tremendous interest with numerous research projects addressing the wealth of challenges in this field. While computational intelligence systems are ...
Text documents often contain valuable structured data that is hidden in regular English sentences. This data is best exploited if available as a relational table that we could use...
We present a novel system and methodology for browsing and exploring topics and concepts within a document collection. The process begins with the generation of multiple taxonomie...
W. Scott Spangler, Jeffrey T. Kreulen, Justin Less...
Abstract. Term weighting is one of the most important aspects of modern Web retrieval systems. The weight associated with a given term in a document shows the importance of the ter...
This paper addresses the task of extracting opinions from a given document collection. Assuming that an opinion can be represented as a tuple Subject, Attribute, Value , we propose...
An oracle is described for dynamic validation of an application (metadata extraction from scanned documents) where a moderate failure rate is acceptable provided that instances of...
Kurt Maly, Steven J. Zeil, Mohammad Zubair, Ashraf...