COLING
2002

Scaled Log Likelihood Ratios for the Detection of Abbreviations in Text Corpora

We describe a language-independent, flexible, and accurate method for the detection of abbreviations in text corpora. It is based on the idea that an abbreviation can be viewed as a collocation of a word and its final period, and can therefore be identified with methods for collocation detection such as the log likelihood ratio. Although the log likelihood ratio is known to show good recall, its precision is poor. We employ scaling factors which lead to a strong improvement in precision. Experiments with English and German corpora show that abbreviations can be detected with high accuracy.
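To illustrate the collocation view, here is a minimal sketch of an unscaled log likelihood ratio for a word followed by a period. It assumes the standard binomial formulation commonly used for collocation detection. The function names, argument names, and counts are hypothetical and not taken from the paper. The scaling factors are not specified in the abstract, so they are deliberately left out.

```python
import math


def _log_binomial(k, n, p):
    """Log-likelihood of k successes in n trials with success probability p.

    The binomial coefficient is omitted because it cancels in the ratio.
    """
    eps = 1e-12  # clamp p so that log(0) is never evaluated
    p = min(max(p, eps), 1.0 - eps)
    return k * math.log(p) + (n - k) * math.log(1.0 - p)


def log_likelihood_ratio(c_word, c_period, c_word_period, n_tokens):
    """Return -2 log(lambda) for the pair (word, following period).

    c_word        -- occurrences of the candidate word (hypothetical count)
    c_period      -- occurrences of the period token in the corpus
    c_word_period -- occurrences of the word directly followed by a period
    n_tokens      -- total number of tokens in the corpus
    """
    rest_periods = c_period - c_word_period  # periods after other words
    rest_tokens = n_tokens - c_word          # tokens that are not the word

    p_null = c_period / n_tokens             # H0: period is independent of word
    p_word = c_word_period / c_word          # H1: P(period | word)
    p_rest = rest_periods / rest_tokens      # H1: P(period | not word)

    log_lambda = (
        _log_binomial(c_word_period, c_word, p_null)
        + _log_binomial(rest_periods, rest_tokens, p_null)
        - _log_binomial(c_word_period, c_word, p_word)
        - _log_binomial(rest_periods, rest_tokens, p_rest)
    )
    return -2.0 * log_lambda


# Made-up counts: a word that almost always precedes a period versus one
# that precedes a period exactly as often as chance predicts.
strong = log_likelihood_ratio(100, 1000, 99, 10000)
chance = log_likelihood_ratio(100, 1000, 10, 10000)
```

In this sketch a candidate that is nearly always followed by a period receives a high score, while one followed by a period at the corpus-wide rate scores close to zero. The statistic is two-sided, so a word that is followed by a period far less often than expected also scores high, which means the raw score alone does not single out abbreviations.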
Tibor Kiss, Jan Strunk
Added 17 Dec 2010
Updated 17 Dec 2010
Type Journal
Year 2002
Where COLING
Authors Tibor Kiss, Jan Strunk