This paper describes a new scoring algorithm that supports comparison of linguistically annotated data from noisy sources. The new algorithm generalizes the Message Understanding ...
John D. Burger, David D. Palmer, Lynette Hirschman
In this paper, we analyze the performance of name finding in the context of a variety of automatic speech recognition (ASR) systems and in the context of one optical character rec...
David R. H. Miller, Sean Boisen, Richard M. Schwar...
Truecasing is the process of restoring case information to badly cased or noncased text. This paper explores truecasing issues and proposes a statistical, language-modeling-based ...
Lucian Vlad Lita, Abraham Ittycheriah, Salim Rouko...
Two different systems are proposed for the task of capitalisation generation. The first system is a slightly modified speech recogniser. In this system, every word in the vocabula...
Search engines that support structured documents typically support structure created by the author (e.g., title, section), and may also support structure added by an annotation pr...