Sciweavers

359 search results - page 46 / 72
CIKM
2011
Springer
12 years 8 months ago
Probabilistic near-duplicate detection using simhash
This paper offers a novel look at using a dimensionality-reduction technique called simhash [8] to detect similar document pairs in large-scale collections. We show that this algo...
Sadhan Sood, Dmitri Loguinov
GBRPR
2007
Springer
14 years 15 days ago
An Efficient Ontology-Based Expert Peering System
This paper proposes an expert peering system for information exchange. Our objective is to develop a real-time search engine for an online community where users can ask e...
Tansu Alpcan, Christian Bauckhage, Sachin Agarwal
PAKDD
2009
ACM
14 years 3 months ago
Spatial Weighting for Bag-of-Visual-Words and Its Application in Content-Based Image Retrieval
It is a challenging and important task to retrieve images from a large and highly varied image data set based on their visual contents. Problems like how to fill the semantic gap b...
Xin Chen, Xiaohua Hu, Xiajiong Shen
ICDE
1999
IEEE
14 years 10 months ago
Document Warehousing Based on a Multimedia Database System
Nowadays, structured data such as sales and business forms are stored in data warehouses for decision makers to use. Further, unstructured data such as emails, HTML texts, images,...
Hiroshi Ishikawa, Kazumi Kubota, Yasuo Noguchi, Ko...
DKE
2006
13 years 8 months ago
FRACTURE mining: Mining frequently and concurrently mutating structures from historical XML documents
In the past few years, the fast proliferation of available XML documents has stimulated a great deal of interest in discovering hidden and nontrivial knowledge from XML repositori...
Ling Chen 0002, Sourav S. Bhowmick, Liang-Tien Chi...