Syntactic and sub-lexical features for Turkish discriminative language models



This paper investigates syntactic and sub-lexical features in Turkish discriminative language models (DLMs). DLM is a feature-based language modeling approach that reranks the ASR output using discriminatively trained feature parameters. Syntactic information is incorporated into DLM as part-of-speech (PoS) tag n-gram features and head-to-head dependency relations. Sub-lexical units are first utilized as language modeling units in the baseline recognizer. Then, sub-lexical features are used to rerank the sub-lexical hypotheses. We explore features on sub-lexical units, similar to the syntactic features, to reveal the implicit morpho-syntactic information conveyed by these units. We find that DLM yields more improvement for sub-lexical units than for words. Basic sub-lexical n-gram features result in a 0.6% reduction over the baseline, and morpho-syntactic features yield an additional 0.4% reduction on the test set.
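The reranking idea in the abstract can be illustrated with a minimal sketch. It assumes a plain perceptron update and n-gram count features; the abstract does not say which training algorithm or feature templates the authors used, so the function names (`ngram_features`, `rerank`, `perceptron_train`), the hypothesis fields (`tokens`, `base_score`, `errors`), and the toy N-best list below are all hypothetical and only meant to show the general shape of discriminative reranking.

```python
from collections import Counter


def ngram_features(tokens, n_max=2, prefix="w"):
    """Count n-grams (up to n_max) over a token sequence.

    The tokens could be words, sub-lexical units, or PoS tags;
    `prefix` keeps the feature spaces apart (assumed convention).
    """
    feats = Counter()
    padded = ["<s>"] + list(tokens) + ["</s>"]
    for n in range(1, n_max + 1):
        for i in range(len(padded) - n + 1):
            feats[(prefix, tuple(padded[i:i + n]))] += 1
    return feats


def score(weights, hyp):
    """Baseline recognizer score plus the weighted feature counts."""
    feats = ngram_features(hyp["tokens"])
    return hyp["base_score"] + sum(weights.get(f, 0.0) * c for f, c in feats.items())


def rerank(weights, nbest):
    """Return the highest-scoring hypothesis in an N-best list."""
    return max(nbest, key=lambda h: score(weights, h))


def perceptron_train(data, epochs=5):
    """Perceptron training: move weights toward the lowest-error hypothesis."""
    weights = {}
    for _ in range(epochs):
        for nbest in data:
            oracle = min(nbest, key=lambda h: h["errors"])
            best = rerank(weights, nbest)
            if best is not oracle:
                for f, c in ngram_features(oracle["tokens"]).items():
                    weights[f] = weights.get(f, 0.0) + c
                for f, c in ngram_features(best["tokens"]).items():
                    weights[f] = weights.get(f, 0.0) - c
    return weights


# Made-up toy N-best list over sub-lexical units; the baseline score
# prefers the hypothesis that contains one error.
toy_nbest = [
    {"tokens": ["ev", "ler", "de"], "base_score": 0.0, "errors": 0},
    {"tokens": ["ev", "ler", "da"], "base_score": 0.5, "errors": 1},
]
weights = perceptron_train([toy_nbest])
print(rerank(weights, toy_nbest)["tokens"])
```

In this sketch, PoS tag n-gram features of the kind the abstract mentions could be obtained by calling `ngram_features` on a tag sequence with a different `prefix` and summing the resulting counters; dependency-based features would need a parser and are not shown.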

Ebru Arisoy, Murat Saraclar, Brian Roark, Izhak Shafran


ICASSP 2010 | Signal Processing | Sub-lexical Features | Sub-lexical Units | Turkish Discriminative Language


» Discriminative Sentence Compression with Soft Syntactic Evidence

» Perceptron Reranking for CCG Realization

» Latent Mixture of Discriminative Experts for Multimodal Prediction Modeling

» Sparse MultiScale Grammars for Discriminative Latent Variable Parsing

Post Info

Added	06 Dec 2010
Updated	06 Dec 2010
Type	Conference
Year	2010
Where	ICASSP
Authors	Ebru Arisoy, Murat Saraclar, Brian Roark, Izhak Shafran
