We present a document routing and index partitioning scheme for scalable similarity-based search of documents in a large corpus. We consider the case when similarity-based search ...
In this paper, we propose a new way to automatically model and predict human behavior of receiving and disseminating information by analyzing the contact and content of personal c...
Xiaodan Song, Ching-Yung Lin, Belle L. Tseng, Ming...
This paper focuses on the problem of identifying influential users of micro-blogging services. Twitter, one of the most notable micro-blogging services, employs a social-networkin...
A wide variety of distortion functions, such as squared Euclidean distance, Mahalanobis distance, Itakura-Saito distance and relative entropy, have been used for clustering. In th...
Arindam Banerjee, Srujana Merugu, Inderjit S. Dhil...
Since the XML format became a de facto standard for structured documents, the IT research and industry have developed a number of XML editors to help users produce structured docu...