Sciweavers

» A Semi-Supervised Document Clustering Technique for Informat...
IDEAS
2009
IEEE
14 years 3 months ago
A cluster-based approach to XML similarity joins
A natural consequence of the widespread adoption of XML as a standard for information representation and exchange is the redundant storage of large amounts of persistent XML documen...
Leonardo Ribeiro, Theo Härder, Fernanda S. Pi...
CIKM
2011
Springer
12 years 8 months ago
Probabilistic near-duplicate detection using simhash
This paper offers a novel look at using a dimensionality-reduction technique called simhash [8] to detect similar document pairs in large-scale collections. We show that this algo...
Sadhan Sood, Dmitri Loguinov
WWW
2008
ACM
14 years 9 months ago
Exploring social annotations for information retrieval
Social annotation has gained increasing popularity in many Web-based applications, leading to an emerging research area in text analysis and information retrieval. This paper is c...
Ding Zhou, Jiang Bian, Shuyi Zheng, Hongyuan Zha, ...
IJKDB
2010
13 years 6 months ago
Clustering Genes Using Heterogeneous Data Sources
Clustering of gene expression data is a standard exploratory technique used to identify closely related genes. Many other sources of data are also likely to be of great assistance...
Erliang Zeng, Chengyong Yang, Tao Li, Giri Narasim...
ICDCS
2002
IEEE
14 years 1 months ago
A Practical Approach for "Zero" Downtime in an Operational Information System
An Operational Information System (OIS) supports a real-time view of an organization’s information critical to its logistical business operations. A central component of an OIS ...
Ada Gavrilovska, Karsten Schwan, Van Oleson