Sign language (SL) recognition modules in human-computer interaction systems need to be both fast and reliable. In cases where multiple sets of features are extracted from the SL data, the recognition system can speed up processing by taking only a subset of extracted features as its input. However, this should not be realised at the expense of a drop in recognition accuracy. By training different recognizers for different subsets of features, we can formulate the problem as the task of planning the sequence of recognizer actions to apply to SL data, while accounting for the trade-off between recognition speed and accuracy. Partially observable Markov decision processes (POMDPs) provide a principled mathematical framework for such planning problems. A POMDP explicitly models the probabilities of observing various outputs from the individual recognizers and thus maintains a probability distribution (or belief) over the set of possible SL input sentences. It then computes a policy that m...
Sylvie C. W. Ong, David Hsu, Wee Sun Lee, Hanna Ku