Sciweavers

» Document clustering with committees
ICDAR
2009
IEEE
14 years 3 months ago
Enhanced Text Extraction from Arabic Degraded Document Images Using EM Algorithm
This paper presents a new enhanced text extraction algorithm for degraded document images based on probabilistic models. The observed document image is considered as a...
Wafa Boussellaa, Aymen Bougacha, Abderrazak Zahour...
ICDAR
2009
IEEE
14 years 3 months ago
A Self-Adaptive Method for Extraction of Document-Specific Alphabets
Recognition and encoding of digitized historical documents is still a challenging and difficult task. A major problem is the occurrence of unknown glyphs and symbols which might n...
Stefan Pletschacher
CIKM
2009
Springer
14 years 3 months ago
Text summarization model based on the budgeted median problem
We propose a multi-document generic summarization model based on the budgeted median problem. Our model selects sentences to generate a summary so that every sentence in the docum...
Hiroya Takamura, Manabu Okumura
COLING
2008
13 years 10 months ago
A Framework for Identifying Textual Redundancy
The task of identifying redundant information in documents that are generated from multiple sources provides a significant challenge for summarization and QA systems. Traditional ...
Kapil Thadani, Kathleen McKeown
DATESO
2004
13 years 10 months ago
Query Expansion and Evolution of Topic in Information Retrieval Systems
An approach based on clustering is described in our paper. The basic version of our system, given in [5], allows us to expand a query through a special index. Hierarchical agglomerativ...
Jiri Dvorský, Jan Martinovic, Václav...