We address the feature selection problem for hidden Markov models (HMMs) in sequence classification. Temporal correlation in sequences often causes difficulty in applying feature selection techniques. Inspired by segmental k-means segmentation (SKS) [1], we propose Segmentally Boosted HMMs (SBHMMs), where the stateoptimized features are constructed in a segmental and discriminative manner. The contributions are twofold. First, we introduce a novel feature selection algorithm, where the temporal dynamics are decoupled from the static learning procedure by assuming that the sequential data are piecewise independent and identically distributed. Second, we show that the SBHMM consistently improves traditional HMM recognition in various domains. The reduction of error compared to traditional HMMs ranges from 17% to 70% in American Sign Language recognition, human gait identification, lip reading, and speech recognition.
Pei Yin, Irfan A. Essa, Thad Starner, James M. Reh