Production of parallel training corpora for the development of statistical machine translation (SMT) systems for resource-poor languages usually requires extensive manual effort. ...
Many researchers see the need for reject inference in credit scoring models to come from a sample selection problem whereby a missing variable results in omitted variable bias. Al...
Corpus-based methods for natural language processing often use supervised training, requiring expensive manual annotation of training corpora. This paper investigates methods for ...
Sample selection bias is a common problem in many real world applications, where training data are obtained under realistic constraints that make them follow a different distribut...