In this paper, we propose a semi-supervised learning approach for distinguishing program-generated (bot) web search traffic from that of genuine human users. The work is motivated by...
Hongwen Kang, Kuansan Wang, David Soukal, Fritz Be...
Many data-management applications require integrating data from a variety of sources, where different sources may refer to the same real-world entity in different ways and some ma...
Principal components analysis (PCA) is one of the most widely used techniques in machine learning and data mining. Minor components analysis (MCA) is less well known, but can also...
Max Welling, Felix V. Agakov, Christopher K. I. Wi...
In this paper, we propose approaches for finding similar legal judgements by extending popular techniques used in information retrieval and search engines...
Sushanta Kumar, P. Krishna Reddy, V. Balakista Red...
The crawler engines of today cannot reach most of the information contained in the Web. A great amount of valuable information is "hidden" behind the query forms of onli...