Sciweavers

110 search results - page 14 / 22
» A Comparison of Two Document Clustering Approaches for Clust...
MMM
2011
Springer
12 years 11 months ago
Correlated PLSA for Image Clustering
Probabilistic Latent Semantic Analysis (PLSA) has become a popular topic model for image clustering. However, the traditional PLSA method considers each image (document) independen...
Peng Li, Jian Cheng, Zechao Li, Hanqing Lu
ECCV
2008
Springer
14 years 9 months ago
Learning Visual Shape Lexicon for Document Image Content Recognition
Developing effective content recognition methods for diverse imagery continues to challenge computer vision researchers. We present a new approach for document image content catego...
Guangyu Zhu, Xiaodong Yu, Yi Li, David S. Doermann
WWW
2010
ACM
14 years 2 months ago
CETR: content extraction via tag ratios
We present Content Extraction via Tag Ratios (CETR) – a method to extract content text from diverse webpages by using the HTML document’s tag ratios. We describe how to comput...
Tim Weninger, William H. Hsu, Jiawei Han
SDM
2003
SIAM
13 years 9 months ago
Finding Clusters of Different Sizes, Shapes, and Densities in Noisy, High Dimensional Data
The problem of finding clusters in data is challenging when clusters are of widely differing sizes, densities and shapes, and when the data contains large amounts of noise and out...
Levent Ertöz, Michael Steinbach, Vipin Kumar
CORR
2006
Springer
13 years 7 months ago
Exploiting multilingual nomenclatures and language-independent text features as an interlingua for cross-lingual text analysis a
We are proposing a simple, but efficient basic approach for a number of multilingual and cross-lingual language technology applications that are not limited to the usual two or th...
Ralf Steinberger, Bruno Pouliquen, Camelia Ignat