We present an algorithm, called the offset tree, for learning in situations where a loss associated with different decisions is not known, but was randomly probed. The algorithm is an optimal reduction from this problem to binary classification. In particular, it has regret at most (k - 1) times the regret of the binary classifier it uses, where k is the number of decisions, and no reduction to binary classification can do better. We test the offset tree empirically and discover that it generally results in superior (or equal) performance, compared to several plausible alternative approaches.