Nowadays, enormous amounts of data are continuously generated not only in massive scale, but also from different, sometimes conflicting, views. Therefore, it is important to conso...
There is an exploding amount of user-generated content on the Web due to the emergence of "Web 2.0" services, such as Blogger, MySpace, Flickr, and del.icio.us. The part...
Ka Cheung Sia, Junghoo Cho, Yun Chi, Belle L. Tsen...
We present a document routing and index partitioning scheme for scalable similarity-based search of documents in a large corpus. We consider the case when similarity-based search ...
Previous efforts on event detection from the web have focused primarily on web content and structure data ignoring the rich collection of web log data. In this paper, we propose t...
Qiankun Zhao, Tie-Yan Liu, Sourav S. Bhowmick, Wei...
Abstract. With the amount of available information on the Web growing rapidly with each day, the need to automatically filter the information in order to ensure greater user effici...
Miha Grcar, Dunja Mladenic, Blaz Fortuna, Marko Gr...