A Bayesian Kullback Ying-Yang (BKYY) dependence reduction system and theory are presented. Via stochastic approximation, implementable algorithms and criteria are given for parameter learning and model selection, respectively. Three typical architectures are further studied on several special cases. The forward architecture is a general information-theoretic dependence reduction model that maps an observation x into a representation y of k independent components, with k detectable by the model selection criteria. For the special case of an invertible map x → y, a general adaptive algorithm is obtained, which is not only applicable to nonlinear or post-nonlinear mixtures but also provides an adaptive EM algorithm that implements the previously proposed learned parametric mixture method for independent component analysis (ICA) on linear mixtures. The backward architecture provides a maximum likelihood independent factor model for modeling observations generated from an unknown number of independent factors via a linear or nonlinear system...
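To make the linear-mixture case concrete, the following is a minimal sketch of an adaptive ICA update of the natural-gradient form W ← W + η (I − φ(y) yᵀ) W, which is the general shape of rule the abstract refers to; it is not the paper's exact algorithm. In the learned parametric mixture method, the score function φ is derived from a parametric mixture density whose parameters are themselves learned adaptively; here a fixed tanh score is substituted for illustration, and all data, dimensions, and rates below are assumed.

```python
import numpy as np

# Minimal sketch, NOT the paper's exact algorithm: adaptive natural-gradient
# ICA on a linear mixture x = A s, with de-mixing y = W x and update
#   W <- W + eta * (I - phi(y) y^T) W.
# The learned parametric mixture method learns phi from a parametric mixture
# density; a fixed tanh score is used here purely for illustration.

rng = np.random.default_rng(0)

# Synthetic data (assumed): k = 2 independent super-Gaussian sources.
n, k = 5000, 2
s = rng.laplace(size=(k, n))        # independent Laplace sources
A = rng.normal(size=(k, k))         # unknown mixing matrix
x = A @ s                           # observed linear mixture

W = np.eye(k)                       # de-mixing matrix estimate
eta = 1e-3                          # learning rate (assumed)

for epoch in range(5):              # a few passes over the samples
    for t in range(n):
        xt = x[:, t:t + 1]
        y = W @ xt
        phi = np.tanh(y)            # illustrative fixed score function
        W += eta * (np.eye(k) - phi @ y.T) @ W

# If separation succeeds, W A approaches a scaled permutation matrix.
print("W @ A =\n", W @ A)
```

Each row of the recovered y then estimates one independent source up to scaling and permutation, the usual ICA indeterminacies.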