The explosion of user-generated content on the Web has led to new opportunities and significant challenges for companies, that are increasingly concerned about monitoring the disc...
Multi-label problems arise in various domains such as multitopic document categorization and protein function prediction. One natural way to deal with such problems is to construc...
We present an algorithm that learns invariant features from real data in an entirely unsupervised fashion. The principal benefit of our method is that it can be applied without hu...
In this paper, we propose a semi-supervised learning approach for classifying program (bot) generated web search traffic from that of genuine human users. The work is motivated by...
Hongwen Kang, Kuansan Wang, David Soukal, Fritz Be...
Pseudo-relevance feedback (PRF) via query-expansion has been proven to be effective in many information retrieval (IR) tasks. In most existing work, the top-ranked documents from...