Document summarization plays an increasingly important role with the exponential growth of documents on the Web. Many supervised and unsupervised approaches have been proposed to ...
Liangda Li, Ke Zhou, Gui-Rong Xue, Hongyuan Zha, Y...
We present a browser-extending Semantic Web extraction system that maps HTML documents to tables and, where possible, to rules. First, the basic data extractor ViPER distills and ...
Abstract. A large volume of data with complex structures is currently represented in GML (Geography Markup Language) for storing and exchanging geographic information. As the size ...
Scoring sentences in documents given abstract summaries created by humans is important in extractive multi-document summarization. In this paper, we formulate extractive summariza...
Abstract. Document decomposition is a basic but crucial step for many document related applications. This paper proposes a novel approach to decompose document images into zones. I...