Beyond N-Grams: Can Linguistic Sophistication Improve Language Modeling?

It seems obvious that a successful model of natural language would incorporate a great deal of both linguistic and world knowledge. Interestingly, state-of-the-art language models for speech recognition are based on a very crude linguistic model, namely conditioning the probability of a word on a small fixed number of preceding words. Despite many attempts to incorporate more sophisticated information into the models, the n-gram model remains the state of the art, used in virtually all speech recognition systems. In this paper we address the question of whether there is hope of improving language modeling by incorporating more sophisticated linguistic and world knowledge, or whether n-grams already capture the majority of the information that can be employed.
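As an illustration of the "crude linguistic model" the abstract refers to, here is a minimal sketch of an n-gram estimate: the probability of a word is conditioned only on the n-1 words before it. This is not the authors' code; the function names `train_ngram` and `prob`, the toy sentence, and the choice of unsmoothed relative-frequency estimates are all assumptions made for illustration (practical systems add smoothing for unseen word sequences).

```python
from collections import Counter

def train_ngram(tokens, n=2):
    """Count n-grams and their (n-1)-word contexts in a token list (hypothetical helper)."""
    spans = range(len(tokens) - n + 1)
    ngrams = Counter(tuple(tokens[i:i + n]) for i in spans)
    contexts = Counter(tuple(tokens[i:i + n - 1]) for i in spans)
    return ngrams, contexts

def prob(word, context, ngrams, contexts):
    """Unsmoothed estimate of P(word | context); returns 0.0 for an unseen context."""
    context = tuple(context)
    seen = contexts[context]
    return ngrams[context + (word,)] / seen if seen else 0.0

# Toy corpus, invented for this example only.
tokens = "the cat sat on the mat".split()
ngrams, contexts = train_ngram(tokens, n=2)

# "the" is followed once by "cat" and once by "mat" in the toy corpus.
p = prob("cat", ["the"], ngrams, contexts)
```

Note that the model sees nothing beyond the fixed window of preceding words, which is exactly the limitation the paper asks about.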

Eric Brill, Radu Florian, John C. Henderson, Lidia Mangu

ACL 1998 | ACL 2007 | Crude Linguistic Model | Speech Recognition | World Knowledge |

Post Info

Added	01 Nov 2010
Updated	01 Nov 2010
Type	Conference
Year	1998
Where	ACL
Authors	Eric Brill, Radu Florian, John C. Henderson, Lidia Mangu
