Wikipedia infoboxes are an example of a seemingly structured, yet extraordinarily heterogeneous dataset, where any given record has only a tiny fraction of all possible fields. Su...
We study the scalable management of XML data in P2P networks based on distributed hash tables (DHTs). We identify performance limitations in this context, and propose an array of t...
Ant-based clustering is a nature-inspired technique whereby stochastic agents perform the task of clustering high-dimensional data. This paper analyzes the popular technique of Lum...
Additive clustering was originally developed within cognitive psychology to enable the development of featural models of human mental representation. The representational flexibili...
This paper introduces a new technique of document clustering based on frequent senses. The proposed system, GDClust (Graph-Based Document Clustering), works with frequent senses ra...