Sciweavers

Search results for: Clustering Orders
WWW
2008
ACM
Efficient similarity joins for near duplicate detection
With the increasing amount of data and the need to integrate data from multiple data sources, a challenging issue is to find near duplicate records efficiently. In this paper, we ...
Chuan Xiao, Wei Wang 0011, Xuemin Lin, Jeffrey Xu ...
WWW
2004
ACM
Graph-based text database for knowledge discovery
While we expect to discover knowledge in the texts available on the Web, such discovery usually requires many complex analysis steps, most of which require different text handling...
Junji Tomita, Hidekazu Nakawatase, Megumi Ishii
KDD
2009
ACM
Meme-tracking and the dynamics of the news cycle
Tracking new topics, ideas, and "memes" across the Web has been an issue of considerable interest. Recent work has developed methods for tracking topic shifts over long ...
Jure Leskovec, Lars Backstrom, Jon M. Kleinberg
KDD
2008
ACM
Permu-pattern: discovery of mutable permutation patterns with proximity constraint
Pattern discovery in sequences is an important problem in many applications, especially in computational biology and text mining. However, due to the noisy nature of data, the tra...
Meng Hu, Jiong Yang, Wei Su
HPCA
2001
IEEE
A New Scalable Directory Architecture for Large-Scale Multiprocessors
The memory overhead introduced by directories constitutes a major hurdle in the scalability of cc-NUMA architectures, which makes the shared-memory paradigm unfeasible for very la...
Manuel E. Acacio, José González, Jos...