Sciweavers

256 search results - page 46 / 52
» A Fully Distributed Framework for Cost-Sensitive Data Mining
KDD
2002
ACM
14 years 8 months ago
Enhanced word clustering for hierarchical text classification
In this paper we propose a new information-theoretic divisive algorithm for word clustering applied to text classification. In previous work, such "distributional clustering...
Inderjit S. Dhillon, Subramanyam Mallela, Rahul Ku...
ICDM
2009
IEEE
14 years 2 months ago
iTopicModel: Information Network-Integrated Topic Modeling
Document networks, i.e., networks associated with text information, are becoming increasingly popular due to the ubiquity of Web documents, blogs, and various kinds of online da...
Yizhou Sun, Jiawei Han, Jing Gao, Yintao Yu
ICDE
2004
IEEE
14 years 9 months ago
Improved File Synchronization Techniques for Maintaining Large Replicated Collections over Slow Networks
We study the problem of maintaining large replicated collections of files or documents in a distributed environment with limited bandwidth. This problem arises in a number of impo...
Torsten Suel, Patrick Noel, Dimitre Trendafilov
CCGRID
2010
IEEE
13 years 8 months ago
A Map-Reduce System with an Alternate API for Multi-core Environments
The map-reduce framework has received significant attention and is being used for programming both large-scale clusters and multi-core systems. While the high productivity aspect of ...
Wei Jiang, Vignesh T. Ravi, Gagan Agrawal
DRR
2008
13 years 9 months ago
Whole-book recognition using mutual-entropy-driven model adaptation
We describe an approach to unsupervised high-accuracy recognition of the textual contents of an entire book using fully automatic mutual-entropy-based model adaptation. Given imag...
Pingping Xiu, Henry S. Baird