In this article, we introduce a new problem: the construction of multi-structured documents. We first offer an overview of existing solutions to the representation of such docum...
Statistical topic models provide a general data-driven framework for automated discovery of high-level knowledge from large collections of text documents. While topic models can p...
Chaitanya Chemudugunta, Padhraic Smyth, Mark Steyv...
Search engines present fix-length passages from documents ranked by relevance against the query. In this paper, we present and compare novel, language-model based methods for extr...
This paper considers extractive summarization of Chinese spoken documents. In contrast to conventional approaches, we attempt to deal with the extractive summarization problem und...
Recent work in supervised learning of term-based retrieval models has shown significantly improved accuracy can often be achieved via better model estimation [2, 10, 11, 17]. In ...