Sciweavers

» Techniques of Cluster Algorithms in Data Mining
WWW
2010
ACM
14 years 3 months ago
Large-scale bot detection for search engines
In this paper, we propose a semi-supervised learning approach for classifying program (bot) generated web search traffic from that of genuine human users. The work is motivated by...
Hongwen Kang, Kuansan Wang, David Soukal, Fritz Be...
PVLDB
2010
13 years 7 months ago
Record Linkage with Uniqueness Constraints and Erroneous Values
Many data-management applications require integrating data from a variety of sources, where different sources may refer to the same real-world entity in different ways and some ma...
Songtao Guo, Xin Dong, Divesh Srivastava, Remi Zaj...
NIPS
2003
13 years 10 months ago
Extreme Components Analysis
Principal components analysis (PCA) is one of the most widely used techniques in machine learning and data mining. Minor components analysis (MCA) is less well known, but can also...
Max Welling, Felix V. Agakov, Christopher K. I. Wi...
COMPUTE
2011
ACM
13 years 6 days ago
Similarity analysis of legal judgments
In this paper, we have made an effort to propose approaches to find similar legal judgements by extending the popular techniques used in information retrieval and search engines...
Sushanta Kumar, P. Krishna Reddy, V. Balakista Red...
DEEC
2007
IEEE
14 years 3 months ago
DeepBot: a focused crawler for accessing hidden web content
The crawler engines of today cannot reach most of the information contained in the Web. A great amount of valuable information is "hidden" behind the query forms of onli...
Manuel Álvarez, Juan Raposo, Alberto Pan, F...