This paper addresses the problem of classification in situations where the data distribution is not homogeneous: Data instances might come from different locations or times, and therefore are sampled from related but different distributions. In particular, features may appear in some parts of the data that are rarely or never seen in others. In most situations with nonhomogeneous data, the training data is not representative of the distribution under which the classifier must operate. We propose a method, based on probabilistic graphical models, for utilizing unseen features during classification. Our method introduces, for each such unseen feature, a continuous hidden variable describing its influence on the class -- whether it tends to be associated with some label. We then use probabilistic inference over the test data to infer a distribution over the value of this hidden variable. Intuitively, we "learn" the role of this unseen feature from the test set, generalizing fro...