IJDAR
2007

Investigation and modeling of the structure of texting language

Language usage in computer-mediated discourse, such as chats, emails and SMS texts, differs significantly from the standard form of the language. Two pressures shape the structure of this non-standard form, known as the texting language: an urge towards shorter messages, which allows faster typing, and the need for semantic clarity. In this work we formally investigate the nature and types of compression used in SMS texts and, based on the findings, develop a word-level model for the texting language. For every word in the standard language, we construct a Hidden Markov Model that succinctly represents all possible variations of that word in the texting language along with their associated observation probabilities. The structure of the HMM is novel and was arrived at through linguistic analysis of the SMS data. The model parameters have been estimated from a word-aligned SMS and standard English parallel corpus through machine learning techniques. Preliminary evaluation shows that the word-model ca...
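
To make the idea of a per-word generative model concrete, here is a minimal sketch of a toy left-to-right model in which each character of the standard word is either kept or deleted, scored with a forward-style dynamic program. This is not the HMM structure described in the paper, which is not reproduced in this abstract. The function names, the deletion-only assumption, and the keep probabilities (0.3 for vowels, 0.9 for consonants) are all hypothetical and chosen for illustration, not estimated from any corpus.

```python
# Toy illustration only: a deletion-only, left-to-right word model.
# The structure and all probabilities are assumptions, not the paper's model.

VOWELS = set("aeiou")


def keep_probabilities(word, p_vowel=0.3, p_consonant=0.9):
    """Return a hypothetical probability of keeping each character of `word`."""
    return [p_vowel if ch in VOWELS else p_consonant for ch in word]


def variant_probability(word, variant, keep=None):
    """Probability that `word` produces `variant` when each character is
    independently kept (emitted) or deleted, summed over all alignments."""
    if keep is None:
        keep = keep_probabilities(word)
    n, m = len(word), len(variant)
    # forward[i][j]: probability that the first i characters of `word`
    # have produced the first j characters of `variant`.
    forward = [[0.0] * (m + 1) for _ in range(n + 1)]
    forward[0][0] = 1.0
    for i in range(1, n + 1):
        for j in range(m + 1):
            # Case 1: character i of the word is deleted.
            total = forward[i - 1][j] * (1.0 - keep[i - 1])
            # Case 2: character i is kept and matches the variant.
            if j > 0 and variant[j - 1] == word[i - 1]:
                total += forward[i - 1][j - 1] * keep[i - 1]
            forward[i][j] = total
    return forward[n][m]


if __name__ == "__main__":
    for candidate in ("text", "txt", "tt", "abc"):
        print(candidate, variant_probability("text", candidate))
```

With these made-up parameters the vowel-dropped form "txt" scores higher than the full spelling "text", and a string that cannot be obtained by deletions alone scores zero. A model fitted to a parallel corpus would replace the fixed keep probabilities with estimated ones and would need further operations, such as character substitution, that this sketch leaves out.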
Type Journal
Year 2007
Where IJDAR
Authors Monojit Choudhury, Rahul Saraf, Vijit Jain, Animesh Mukherjee, Sudeshna Sarkar, Anupam Basu