Information retrieval systems consist of many complicated components. Research and development of such systems is often hampered by the difficulty in evaluating how each particula...
Daniel M. Dunlavy, Dianne P. O'Leary, John M. Conr...
This paper is concerned with automatic extraction of titles from the bodies of HTML documents (web pages). Titles of HTML documents should be correctly defined in the title fields...
In recent years, there has been increased interest in topic-focused multi-document summarization. In this task, automatic summaries are produced in response to a specific informat...
Lucy Vanderwende, Hisami Suzuki, Chris Brockett, A...
In this article, we elaborate on the meaning of quality in digital libraries (DLs) by proposing a model that is deeply grounded in a formal framework for digital libraries: 5S (St...
Variants of Huffman codes where words are taken as the source symbols are currently the most attractive choices to compress natural language text databases. In particular, Tagged...