Retweeting is an important action (behavior) on Twitter, indicating the behavior that users re-post microblogs of their friends. While much work has been conducted for mining text...
Zi Yang, Jingyi Guo, Keke Cai, Jie Tang, Juanzi Li...
Current Information Retrieval systems use inverted index structures for efficient query processing. Due to the extremely large size of many data sets, these index structures are u...
We explore the application of a graph representation to model similarity relationships that exist among images found on the Web. The resulting similarity-induced graph allows us t...
Barbara Poblete, Benjamin Bustos, Marcelo Mendoza,...
In recent years, there has been an explosion of publicly available RDF and OWL data sources. In order to effectively and quickly answer queries in such an environment, we present...
In recent years, a number of projects have turned to Wikipedia to establish large-scale taxonomies that describe orders of magnitude more entities than traditional manually built ...
Current recommendation methods are mainly classified into contentbased, collaborative filtering and hybrid methods. These methods are based on similarity measurements among item...
Yi Cai, Ho-fung Leung, Qing Li, Jie Tang, Juanzi L...
Indexes for large collections are often divided into shards that are distributed across multiple computers and searched in parallel to provide rapid interactive search. Typically,...
The major challenge in mining data streams is the issue of concept drift, the tendency of the underlying data generation process to change over time. In this paper, we propose a g...
Matrix factorization methods are among the most common techniques for detecting latent components in data. Popular examples include the Singular Value Decomposition or Nonnegative...
Christian Thurau, Kristian Kersting, Christian Bau...