Sciweavers

» Subjective Document Classification Using Network Analysis
WWW
2008
ACM
14 years 10 months ago
As we may perceive: finding the boundaries of compound documents on the web
This paper considers the problem of identifying on the Web compound documents (cDocs), groups of web pages that in aggregate constitute semantically coherent information entities...
Pavel Dmitriev
CIKM
2008
Springer
13 years 11 months ago
Similarity cross-analysis of tag / co-tag spaces in social classification systems
Recent growth of social classification systems due to steadily increasing popularity has established a multitude of heterogeneous, isolated, non-integrated, and non-interoperable t...
Steffen Oldenburg, Martin Garbe, Clemens H. Cap
WWW
2006
ACM
14 years 10 months ago
Large-scale text categorization by batch mode active learning
Large-scale text categorization is an important research topic for Web data mining. One of the challenges in large-scale text categorization is how to reduce the amount of human e...
Steven C. H. Hoi, Rong Jin, Michael R. Lyu
IDEAL
2000
Springer
14 years 25 days ago
Quantization of Continuous Input Variables for Binary Classification
Quantization of continuous variables is important in data analysis, especially for some model classes such as Bayesian networks and decision trees, which use discrete variables. Of...
Michal Skubacz, Jaakko Hollmén
SIGIR
2010
ACM
14 years 1 months ago
Adaptive near-duplicate detection via similarity learning
In this paper, we present a novel near-duplicate document detection method that can easily be tuned for a particular domain. Our method represents each document as a real-valued s...
Hannaneh Hajishirzi, Wen-tau Yih, Aleksander Kolcz