For the task of near-duplicated document detection, both traditional fingerprinting techniques used in database community and bag-of-word comparison approaches used in information...
In traditional text clustering methods, documents are represented as "bags of words" without considering the semantic information of each document. For instance, if two ...
Xiaohua Hu, Xiaodan Zhang, Caimei Lu, E. K. Park, ...
This paper reports on the INRIA group’s approach to XML mining while participating in the INEX XML Mining track 2005. We use a flexible representation of XML documents that allo...
Anne-Marie Vercoustre, Mounir Fegas, Saba Gul, Yve...
Abstract. We present the results of UMBC’s participation in the Web and Novelty tracks. We explored various heuristics-based link analysis approaches to the Topic Distillation ta...
Srikanth Kallurkar, Yongmei Shi, R. Scott Cost, Ch...
— Large-scale mobile ad-hoc networks require flexible and stable clustered network structure for efficient data collection and dissemination. In this paper, a scheme is present...