Sciweavers

Elimination of Redundant Information for Web Data Mining
ICDM
2006
IEEE
14 years 1 month ago
Accelerating Newton Optimization for Log-Linear Models through Feature Redundancy
Log-linear models are widely used for labeling feature vectors and graphical models, typically to estimate robust conditional distributions in presence of a large number of pot...
Arpit Mathur, Soumen Chakrabarti
KDD
2006
ACM
14 years 8 months ago
Maximally informative k-itemsets and their efficient discovery
In this paper we present a new approach to mining binary data. We treat each binary feature (item) as a means of distinguishing two sets of examples. Our interest is in selecting ...
Arno J. Knobbe, Eric K. Y. Ho
PKDD
2007
Springer
14 years 1 month ago
Site-Independent Template-Block Detection
Detection of template and noise blocks in web pages is an important step in improving the performance of information retrieval and content extraction. Of the many approaches propos...
Aleksander Kolcz, Wen-tau Yih
PAKDD
2010
ACM
14 years 9 days ago
Answer Diversification for Complex Question Answering on the Web
We present a novel graph ranking model to extract a diverse set of answers for complex questions via random walks over a negative-edge graph. We assign a negative sign to edge weig...
Palakorn Achananuparp, Xiaohua Hu, Tingting He, Ch...
DAWAK
2003
Springer
14 years 23 days ago
Fighting Redundancy in SQL
Many SQL queries with aggregated subqueries exhibit redundancy (overlap in FROM and WHERE clauses). We propose a method, called the for-loop, to optimize such queries by ...
Antonio Badia, Dev Anand