Sciweavers

Keyword-based document clustering
EMNLP
2010
13 years 5 months ago
Evaluating Models of Latent Document Semantics in the Presence of OCR Errors
Models of latent document semantics such as the mixture of multinomials model and Latent Dirichlet Allocation have received substantial attention for their ability to discover top...
Daniel David Walker, William B. Lund, Eric K. Ring...
ECIR
2006
Springer
13 years 9 months ago
Automatic Document Organization in a P2P Environment
This paper describes an efficient method to construct reliable machine learning applications in peer-to-peer (P2P) networks by building ensemble-based meta methods. We co...
Stefan Siersdorfer, Sergej Sizov
JCDL
2003
ACM
14 years 1 months ago
Automatic Document Metadata Extraction Using Support Vector Machines
Automatic metadata generation provides scalability and usability for digital libraries and their collections. Machine learning methods offer robust and adaptable automatic metadat...
Hui Han, C. Lee Giles, Eren Manavoglu, Hongyuan Zh...
SSWMC
2004
13 years 9 months ago
Show-through watermarking of duplex printed documents
A technique for watermarking duplex printed pages is presented. The technique produces visible watermark patterns like conventional watermarks embedded in paper fabric. Watermark ...
Gaurav Sharma, Shen-ge Wang
DAS
2004
Springer
14 years 1 months ago
Adaptive Region Growing Color Segmentation for Text Using Irregular Pyramid
This paper presents the result of an adaptive region growing segmentation technique for color document images using an irregular pyramid structure. The emphasis is in the segmentat...
Poh Kok Loo, Chew Lim Tan