Sciweavers

» Comparing Dimension Reduction Techniques for Document Cluste...
HICSS
2007
IEEE
14 years 2 months ago
Essential Dimensions of Latent Semantic Indexing (LSI)
Latent Semantic Indexing (LSI) is commonly used to match queries to documents in information retrieval applications. LSI has been shown to improve retrieval performance for some, ...
April Kontostathis
ICPR
2008
IEEE
14 years 2 months ago
A robust technique for text extraction in mixed-type binary documents
A crucial preprocessing stage in applications such as OCR is text extraction from mixed-type documents. The present work, in contrast to most until now, successfully faces the pro...
Charalambos Strouthopoulos, Athanasios Nikolaidis
ICPR
2000
IEEE
14 years 9 months ago
Improved Degraded Document Recognition with Hybrid Modeling Techniques and Character N-Grams
In this paper a robust multifont character recognition system for degraded documents such as photocopy or fax is described. The system is based on Hidden Markov Models (HMMs) usin...
Anja Brakensiek, Daniel Willett, Gerhard Rigoll
ICCV
2003
IEEE
14 years 9 months ago
Mean Shift Based Clustering in High Dimensions: A Texture Classification Example
Feature space analysis is the main module in many computer vision tasks. The most popular technique, k-means clustering, however, has two inherent limitations: the clusters are co...
Bogdan Georgescu, Ilan Shimshoni, Peter Meer
ECIR
2007
Springer
13 years 9 months ago
A Hierarchical Consensus Architecture for Robust Document Clustering
Abstract. A major problem encountered by text clustering practitioners is the difficulty of determining a priori which is the optimal text representation and clustering technique f...
Xavier Sevillano, Germán Cobo, Francesc Al&...