Text clustering is most commonly treated as a fully automated task without user supervision. However, we can improve clustering performance using supervision in the form of pairwi...
How to effectively protect against spam on search ranking results is an important issue for contemporary web search engines. This paper addresses the problem of combating one majo...
Guoyang Shen, Bin Gao, Tie-Yan Liu, Guang Feng, Sh...
The discovery of characteristic rules is a well-known data mining task and has led to several successful applications. However, because of the descriptive nature of characteristic...
We introduce a new framework for feature grouping based on factor graphs, which are graphical models that encode interactions among arbitrary numbers of random variables....
This paper presents an approach to automatically optimizing the retrieval quality of search engines using clickthrough data. Intuitively, a good information retrieval system shoul...