Abstract. We develop a probabilistic interpretation of non-linear component extraction in neural networks that activate their hidden units according to a softmax-like mechanism. On the basis of a generative model that combines hidden causes using the max-function, we show how the extraction of input components in such networks can be interpreted as maximum likelihood parameter optimization. A simple and neurally plausible Hebbian Δ-rule is derived. For approximately optimal learning, the activity of the hidden neural units is described by a generalized softmax function, and the classical softmax is recovered for very sparse input. We use the bars benchmark test to numerically verify our analytical results and to show the competitiveness of the derived learning algorithms.
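As an illustrative sketch of the max-based combination mentioned above (the notation here is assumed for illustration, not defined in the abstract): with binary hidden causes $s_h \in \{0,1\}$ and generative weights $W_{hd}$, the expected value of each observed variable $y_d$ would be set by the maximum over the weights of the active causes,
\[
  \bar{y}_d(\vec{s}, W) \;=\; \max_h \,\{\, s_h\, W_{hd} \,\},
\]
in contrast to the linear superposition $\sum_h s_h W_{hd}$ assumed by standard linear component-extraction models.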