Online reviews are an important source of information for consumers evaluating products and services on the Internet (e.g., Amazon, Yelp). However, more and more fraudulent reviewers write fake...
In this paper, the problem of safe exploration in the active learning context is considered. Safe exploration is especially important for data sampling from technical and industria...
Factorization machines are a generic framework that can mimic many factorization models simply through feature engineering. In this way, they combine the high predictiv...
Suppose we are given a set of databases, such as sales records over different branches. How can we characterise the differences and the norm between these datasets? That is, what a...
Post streams from public social media platforms such as Instagram and Twitter have become valuable but noisy data sources for discovering what is happening around us. In this paper, we...
Clustering validation is a crucial part of choosing the clustering algorithm that performs best on a given input dataset. Internal clustering validation is efficient and realisti...
Learning user/item relations is a key issue in recommender systems, and existing methods mostly measure the user/item relation from one particular aspect, e.g., historical ratings, e...
Bin Fu, Guandong Xu, Longbing Cao, Zhihai Wang, Zh...
We introduce the problem of rank matrix factorisation (RMF). That is, we consider the decomposition of a rank matrix, in which each row is a (partial or complete) ranking of all co...
Thanh Le Van, Matthijs van Leeuwen, Siegfried Nijs...
In outlying aspects mining, given a query object, we aim to answer the question as to what features make the query most outlying. The most recent works tackle this problem using tw...
Nguyen Xuan Vinh, Jeffrey Chan, James Bailey, Chri...
Crowdsourcing provides a new way to distribute enormous tasks to a crowd of annotators. The divergent knowledge backgrounds and personal preferences of crowd annotators lead to nois...