We design a class of submodular functions meant for document summarization tasks. These functions each combine two terms, one which encourages the summary to be representative of ...
This paper presents a document restoration technique that flattens curled document images captured with a digital camera. The proposed method corrects camera images of...
This paper proposes a model-based text line segmentation algorithm for machine-printed document images. The model is based on geometric configuration which uses the interline sp...
This paper presents a new method for localization of digit strings with a specific syntax in Farsi/Arabic document images. First, some features are extracted from all connected...
In traditional text clustering methods, documents are represented as "bags of words" without considering the semantic information of each document. For instance, if two ...
Xiaohua Hu, Xiaodan Zhang, Caimei Lu, E. K. Park, ...