Sciweavers

1679 search results - page 102 / 336
» Evaluation by comparing result sets in context
WWW
2008
ACM
14 years 9 months ago
As we may perceive: finding the boundaries of compound documents on the web
This paper considers the problem of identifying on the Web compound documents (cDocs), groups of web pages that in aggregate constitute semantically coherent information entities...
Pavel Dmitriev
PVLDB
2008
13 years 8 months ago
Discovering data quality rules
Dirty data is a serious problem for businesses leading to incorrect decision making, inefficient daily operations, and ultimately wasting both time and money. Dirty data often ari...
Fei Chiang, Renée J. Miller
PAKDD
2009
ACM
14 years 3 months ago
Aggregated Subset Mining
The usual data mining setting uses the full amount of data to derive patterns for different purposes. Taking cues from machine learning techniques, we explore ways to divide the d...
Albrecht Zimmermann, Björn Bringmann
IAT
2009
IEEE
14 years 26 days ago
Confusion and Distance Metrics as Performance Criteria for Hierarchical Classification Spaces
When intelligent systems reason about complex problems with a large hierarchical classification space it is hard to evaluate system performance. For classification problems, differ...
Wilbert van Norden, Catholijn M. Jonker
COLING
2010
13 years 4 months ago
Adaptive Development Data Selection for Log-linear Model in Statistical Machine Translation
This paper addresses the problem of dynamic model parameter selection for log-linear model based statistical machine translation (SMT) systems. In this work, we propose a principle...
Mu Li, Yinggong Zhao, Dongdong Zhang, Ming Zhou