Tone-enhanced generalized character posterior probability (GCPP), a generalized form of posterior probability at the subword (Chinese character) level, is proposed as a rescoring metric for improving Cantonese large-vocabulary continuous speech recognition (LVCSR) performance. The search network is constructed by converting the original word graph first to a restructured word graph, then to a character graph, and finally to a character confusion network (CCN). Using GCPP enhanced with tone information, either the character error rate (CER) is minimized or the GCPP product is maximized over a chosen graph. Experimental results show that the tone-enhanced GCPP reduces the character error rate by up to 15.1% relative.
Yao Qian, Frank K. Soong, Tan Lee
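
As a rough illustration of the final decoding step over a character confusion network, the sketch below picks the highest-posterior character in each slot, which maximizes the product of per-slot posteriors. It is a minimal sketch under stated assumptions, not the authors' implementation: the network layout (a list of dictionaries), the `<eps>` label for an empty arc, the function name `decode_ccn`, and all posterior values are hypothetical, and the posteriors are assumed to already include whatever tone-based enhancement is applied upstream.

```python
import math

EPS = "<eps>"  # hypothetical label for an empty (skip) arc in a slot


def decode_ccn(ccn):
    """Pick the highest-posterior entry in each slot of a confusion network.

    ccn: list of slots; each slot maps a character (or EPS) to its posterior.
    Returns the decoded character sequence and the log of the posterior product.
    """
    chars = []
    log_product = 0.0
    for slot in ccn:
        # Choosing the per-slot maximum maximizes the overall product.
        best_char, best_post = max(slot.items(), key=lambda kv: kv[1])
        log_product += math.log(best_post)
        if best_char != EPS:  # empty arcs contribute no output character
            chars.append(best_char)
    return chars, log_product


# Toy network with made-up posteriors (illustration only, not real data).
toy_ccn = [
    {"香": 0.7, "鄉": 0.3},
    {"港": 0.6, "講": 0.3, EPS: 0.1},
    {EPS: 0.8, "人": 0.2},
]

best_chars, log_prod = decode_ccn(toy_ccn)
print("".join(best_chars), math.exp(log_prod))
```

Summing log posteriors instead of multiplying raw values avoids numerical underflow on long utterances; if the slot posteriors are well calibrated, the same per-slot choice also minimizes the expected number of character errors.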