To address the problem of learning when both the positive and negative training sets are enormous, this paper proposes a novel Matrix-Structural Learning (MSL) method as an extension of Viola and Jones’ cascade learning method for object detection. Unlike Viola and Jones’ method, which learns linearly by bootstrapping only negative samples, the proposed MSL method bootstraps both positive and negative samples in a matrix-like structure. Moreover, an accumulative scheme is presented that improves the training efficiency of MSL by inheriting features learned earlier in the training procedure. The proposed method is evaluated on the face detection problem. On a positive set containing 230,000 face samples, only 12 hours are needed on a common PC with a 3.20 GHz Pentium IV processor to learn a classifier with a false alarm rate below 1/1,000,000. Furthermore, the accuracy of the learned detector exceeds the state-of-the-art results on the CMU+MIT frontal face test set.