Sciweavers

» A Semi-Supervised Document Clustering Technique for Informat...
SIGIR
2006
ACM
14 years 1 month ago
Near-duplicate detection by instance-level constrained clustering
For the task of near-duplicated document detection, both traditional fingerprinting techniques used in database community and bag-of-word comparison approaches used in information...
Hui Yang, James P. Callan
HT
2005
ACM
14 years 1 month ago
As we may perceive: inferring logical documents from hypertext
In recent years, many algorithms for the Web have been developed that work with information units distinct from individual web pages. These include segments of web pages or aggreg...
Pavel Dmitriev, Carl Lagoze, Boris Suchkov
JCDL
2005
ACM
14 years 1 month ago
An initial evaluation of automated organization for digital library browsing
In this article we present an evaluation of text clustering and classification methods for creating digital library browse interfaces, focusing on the particular case of collecti...
Aaron Krowne, Martin Halbert
ICPR
2004
IEEE
14 years 8 months ago
Coordinate Systems Reconstruction for Graphical Documents by Hough-feature Clustering and Geometric Analysis
Two-dimensional and three-dimensional coordinate systems are the basic graphics symbols in many graphical documents. A robust coordinate system detection scheme is needed in order...
Chew Lim Tan, Yan Ping Zhou
MM
2004
ACM
14 years 29 days ago
Hierarchical clustering of WWW image search results using visual, textual and link information
We consider the problem of clustering Web image search results. Generally, the image search results returned by an image search engine contain multiple topics. Organizing the resu...
Deng Cai, Xiaofei He, Zhiwei Li, Wei-Ying Ma, Ji-R...