We give an algorithm that learns any monotone Boolean function f : {-1, 1}^n → {-1, 1} to any constant accuracy, under the uniform distribution, in time polynomial in n and in the decision tree size of f. This is the first algorithm that can learn arbitrary monotone Boolean functions to high accuracy, using random examples only, in time polynomial in a reasonable measure of the complexity of f. A key ingredient of the result is a new bound showing that the average sensitivity of any monotone function computed by a decision tree of size s must be at most √(log s). This bound has already proved to be of independent utility in the study of decision tree complexity [27]. We generalize the basic inequality and learning result described above in various ways; specifically, to partition size (a stronger complexity measure than decision tree size), p-biased measures over the Boolean cube (rather than just the uniform distribution), and real-valued (rather than just Boolean-valued) functions. Some...
Ryan O'Donnell, Rocco A. Servedio
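
For concreteness, the key inequality mentioned in the abstract can be written out as follows. This is only a sketch using the standard definition of average sensitivity; the notation AS(f) and x^{⊕ i} is ours rather than the abstract's.

% Sketch: the average-sensitivity bound stated in the abstract, written with
% the standard definition of average sensitivity (notation is ours).
\[
  \mathrm{AS}(f) \;:=\; \sum_{i=1}^{n} \Pr_{x \sim \{-1,1\}^n}\!\bigl[\, f(x) \neq f(x^{\oplus i}) \,\bigr] \;\le\; \sqrt{\log s},
\]
where f is a monotone Boolean function computed by a decision tree of size s, x is drawn uniformly from {-1, 1}^n, and x^{⊕ i} denotes x with its i-th coordinate flipped.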