Sciweavers

2244 search results - page 129 / 449
» Subjective Document Classification Using Network Analysis
WWW
2008
ACM
16 years 2 months ago
As we may perceive: finding the boundaries of compound documents on the web
This paper considers the problem of identifying on the Web compound documents (cDocs): groups of web pages that in aggregate constitute semantically coherent information entities...
Pavel Dmitriev
CIKM
2008
Springer
15 years 4 months ago
Similarity cross-analysis of tag / co-tag spaces in social classification systems
Recent growth of social classification systems due to steadily increasing popularity has established a multitude of heterogeneous isolated, non-integrated, and non-interoperable t...
Steffen Oldenburg, Martin Garbe, Clemens H. Cap
WWW
2006
ACM
16 years 2 months ago
Large-scale text categorization by batch mode active learning
Large-scale text categorization is an important research topic for Web data mining. One of the challenges in large-scale text categorization is how to reduce the amount of human e...
Steven C. H. Hoi, Rong Jin, Michael R. Lyu
IDEAL
2000
Springer
15 years 5 months ago
Quantization of Continuous Input Variables for Binary Classification
Quantization of continuous variables is important in data analysis, especially for some model classes such as Bayesian networks and decision trees, which use discrete variables. Of...
Michal Skubacz, Jaakko Hollmén
SIGIR
2010
ACM
15 years 6 months ago
Adaptive near-duplicate detection via similarity learning
In this paper, we present a novel near-duplicate document detection method that can easily be tuned for a particular domain. Our method represents each document as a real-valued s...
Hannaneh Hajishirzi, Wen-tau Yih, Aleksander Kolcz