ACL
2008

Lexicalized Phonotactic Word Segmentation

Download www.cs.uiuc.edu

This paper presents a new unsupervised algorithm (WordEnds) for inferring word boundaries from transcribed adult conversations. Phone n-grams before and after observed pauses are used to bootstrap a simple discriminative model of boundary marking. This fast algorithm delivers high performance even on morphologically complex words in English and Arabic, and promising results on accurate phonetic transcriptions with extensive pronunciation variation. Expanding training data beyond the traditional miniature datasets pushes performance numbers well above those previously reported. This suggests that WordEnds is a viable model of child language acquisition and might be useful in speech understanding.
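
The idea of bootstrapping from pauses can be illustrated with a small sketch. This is not the WordEnds algorithm itself, whose details are not given in the abstract. It is a simplified stand-in that assumes each utterance is a list of phone symbols bounded by pauses, and that the fraction of times an n-gram occurs right before (or right after) a pause is a usable estimate of how likely it is to end (or begin) a word. The function names, the averaging of the two estimates, and the 0.5 threshold are all hypothetical choices made for illustration.

```python
from collections import Counter


def train_pause_model(utterances, n=2):
    """Count each phone n-gram overall and at utterance edges (pauses).

    Assumption: an utterance is a list of phone symbols, and its start
    and end coincide with observed pauses.
    """
    left_total, left_at_end = Counter(), Counter()
    right_total, right_at_start = Counter(), Counter()
    for utt in utterances:
        # n-grams that end at position i (i.e. sit just before a gap at i)
        for i in range(n, len(utt) + 1):
            gram = tuple(utt[i - n:i])
            left_total[gram] += 1
            if i == len(utt):          # followed by a pause
                left_at_end[gram] += 1
        # n-grams that start at position i (i.e. sit just after a gap at i)
        for i in range(0, len(utt) - n + 1):
            gram = tuple(utt[i:i + n])
            right_total[gram] += 1
            if i == 0:                 # preceded by a pause
                right_at_start[gram] += 1
    return left_total, left_at_end, right_total, right_at_start


def boundary_score(utt, i, model, n=2):
    """Average the 'ends before a pause' and 'starts after a pause' rates
    for the n-grams on either side of position i (hypothetical scoring)."""
    left_total, left_at_end, right_total, right_at_start = model
    p_left = p_right = 0.0
    if i >= n:
        gram = tuple(utt[i - n:i])
        if left_total[gram]:
            p_left = left_at_end[gram] / left_total[gram]
    if i + n <= len(utt):
        gram = tuple(utt[i:i + n])
        if right_total[gram]:
            p_right = right_at_start[gram] / right_total[gram]
    return (p_left + p_right) / 2


def segment(utt, model, n=2, threshold=0.5):
    """Split utt wherever the boundary score reaches the threshold."""
    words, start = [], 0
    for i in range(1, len(utt)):
        if boundary_score(utt, i, model, n) >= threshold:
            words.append(utt[start:i])
            start = i
    words.append(utt[start:])
    return words


if __name__ == "__main__":
    # Toy data: letters stand in for phones; each list is one pause-bounded utterance.
    training = [list("ab"), list("ab"), list("cd")]
    model = train_pause_model(training, n=1)
    print(segment(list("abcd"), model, n=1))
```

The sketch uses raw relative frequencies with no smoothing, so n-grams never seen in training contribute a score of zero. A real system trained on large conversational corpora would need a more careful estimate.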

Margaret M. Fleck

Post Info

Added	29 Oct 2010
Updated	29 Oct 2010
Type	Conference
Year	2008
Where	ACL
Authors	Margaret M. Fleck
