ACL
1998

Beyond N-Grams: Can Linguistic Sophistication Improve Language Modeling?

It seems obvious that a successful model of natural language would incorporate a great deal of both linguistic and world knowledge. Interestingly, state-of-the-art language models for speech recognition are based on a very crude linguistic model, namely conditioning the probability of a word on a small, fixed number of preceding words. Despite many attempts to incorporate more sophisticated information into the models, the n-gram model remains the state of the art, used in virtually all speech recognition systems. In this paper we address the question of whether there is hope of improving language modeling by incorporating more sophisticated linguistic and world knowledge, or whether n-grams already capture the majority of the information that can be employed.
Added 01 Nov 2010
Updated 01 Nov 2010
Type Conference
Year 1998
Where ACL
Authors Eric Brill, Radu Florian, John C. Henderson, Lidia Mangu