Semistructured data occur in situations where information lacks a homogeneous structure and is incomplete. Yet, up to now the incompleteness of information has not been re ected b...
Abstract We present an active learning framework that predicts the tradeoff between the effort and information gain associated with a candidate image annotation, thereby ranking un...
In this paper, we address the problem of semisupervision in the framework of parametric clustering by using labeled and unlabeled data together. Clustering algorithms can take adv...
Many recent statistical parsers rely on a preprocessing step which uses hand-written, corpus-specific rules to augment the training data with extra information. For example, head-...
We study model selection strategies based on penalized empirical loss minimization. We point out a tight relationship between error estimation and data-based complexity penalizatio...