Modeling Phone Duration of Lithuanian by Classification and Regression Trees, using Very Large Speech Corpus


A classification and regression tree (CART) approach was used in this research to model the phone duration of Lithuanian. The experimental part of the research used 300 thousand vowel samples and 400 thousand consonant samples extracted from the VDU-AB20 corpus. A set of 15 parameters characterizing a phone and its context was selected for duration prediction. The most significant of them were the identifier (ID) of the phone being predicted, the IDs of adjacent phones, and the number of phones in the syllable. Models were built using two different data sets: one with a single speaker and one with 20 speakers. The influence of cost-complexity pruning and of different pre-pruning values was investigated. Prediction by average leaf duration was also compared with prediction by median leaf duration. The most salient errors were investigated, speech rate normalization and trivial noise reduction were applied, and their influence on the models' evaluation parameters is discussed. The achieved results, correlation 0.8 and 0.75 respectively for vowels and c...
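The comparison between average and median leaf prediction mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy durations, the function names fit_leaves, pearson and rmse, and the use of the phone ID alone as a stand-in for a tree leaf are all assumptions made for illustration, and the evaluation is done on the same toy data used for fitting.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, median

# Hypothetical toy data: (phone_id, duration_ms). Not taken from VDU-AB20.
# The 300 ms sample mimics a gross segmentation error in one leaf.
samples = [
    ("a", 80.0), ("a", 90.0), ("a", 85.0), ("a", 300.0),
    ("s", 110.0), ("s", 120.0), ("s", 115.0),
    ("t", 60.0), ("t", 70.0), ("t", 65.0),
]

def fit_leaves(data, reducer):
    """Group durations by phone ID (a stand-in for a tree leaf)
    and summarize each group with the given reducer (mean or median)."""
    groups = defaultdict(list)
    for phone_id, duration in data:
        groups[phone_id].append(duration)
    return {phone_id: reducer(values) for phone_id, values in groups.items()}

def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def rmse(xs, ys):
    """Root mean squared error between two equal-length sequences."""
    return sqrt(mean((x - y) ** 2 for x, y in zip(xs, ys)))

mean_leaves = fit_leaves(samples, mean)      # prediction by average leaf duration
median_leaves = fit_leaves(samples, median)  # prediction by median leaf duration

observed = [duration for _, duration in samples]
pred_mean = [mean_leaves[phone_id] for phone_id, _ in samples]
pred_median = [median_leaves[phone_id] for phone_id, _ in samples]

print("leaf 'a' mean:", mean_leaves["a"], "median:", median_leaves["a"])
print("correlation (mean leaves):", pearson(observed, pred_mean))
print("correlation (median leaves):", pearson(observed, pred_median))
print("RMSE (mean leaves):", rmse(observed, pred_mean))
print("RMSE (median leaves):", rmse(observed, pred_median))
```

In this toy setup the single outlier pulls the mean of leaf "a" well above its typical durations while the median stays close to them, which is one plausible reason to compare the two leaf summaries. A real regression tree would split on all 15 context parameters instead of the phone ID alone.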

Giedrius Norkevicius, Gailius Raskinis

INFORMATICALT 2006 | INFORMATICALT 2008 | Phone Duration | Phones | Regression Tree Approach |

Added	27 Dec 2010
Updated	27 Dec 2010
Type	Journal
Year	2008
Where	INFORMATICALT
Authors	Giedrius Norkevicius, Gailius Raskinis
