It is shown that the best automatic speech recognition (ASR) results are attained when pre-processing is carried out pitch-synchronously. Specifically, the analysis step has to equal the duration of the current quasiperiod, and each analysis interval has to consist of an integer number of quasiperiods with a total duration of 45-60 ms. Models and measures of quasiperiodicity and non-quasiperiodicity are given and discussed, together with their application to the optimal segmentation of speech signals into single quasiperiods. Ways to embed these pre-processing results into the recognition procedure are then described.
Taras K. Vintsiuk, Mykola M. Sazhok
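
The framing rule stated in the abstract (a step of one quasiperiod, intervals made of a whole number of quasiperiods totalling 45-60 ms) can be illustrated with a minimal sketch. It assumes the quasiperiod boundaries are already known as sample indices; the function name `pitch_synchronous_intervals`, its parameters, and the choice of the shortest admissible interval starting at each boundary are illustrative assumptions, not the authors' procedure.

```python
from typing import List, Tuple


def pitch_synchronous_intervals(
    marks: List[int],
    sample_rate: int,
    min_ms: float = 45.0,
    max_ms: float = 60.0,
) -> List[Tuple[int, int]]:
    """Return (start, end) sample ranges for pitch-synchronous analysis.

    `marks` holds increasing sample indices of quasiperiod boundaries
    (hypothetical input; how they are found is outside this sketch).
    Each interval starts at a boundary, ends at a later boundary, and
    covers the smallest whole number of quasiperiods lasting at least
    `min_ms`. Intervals longer than `max_ms` or shorter than `min_ms`
    are skipped.
    """
    min_len = min_ms * sample_rate / 1000.0
    max_len = max_ms * sample_rate / 1000.0
    last = len(marks) - 1
    intervals: List[Tuple[int, int]] = []
    j = 0
    for i in range(last):
        # The end boundary never moves backwards as the start advances.
        j = max(j, i + 1)
        # Extend by whole quasiperiods until the minimum duration is met.
        while j < last and marks[j] - marks[i] < min_len:
            j += 1
        length = marks[j] - marks[i]
        if min_len <= length <= max_len:
            intervals.append((marks[i], marks[j]))
    return intervals


if __name__ == "__main__":
    # Synthetic boundaries: constant 10 ms quasiperiod at 8 kHz.
    demo_marks = list(range(0, 801, 80))
    for start, end in pitch_synchronous_intervals(demo_marks, 8000):
        print(start, end)
```

Successive intervals start one quasiperiod apart, so the analysis step follows the current pitch period rather than a fixed hop. With the synthetic boundaries above, each interval spans five quasiperiods (50 ms).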