One challenge in text processing is the treatment of case-insensitive documents such as speech recognition results. The traditional approach is to re-train a language model exclud...
Cheng Niu, Wei Li 0003, Jihong Ding, Rohini K. Sri...
This work applies boosted wrapper induction (BWI), a machine learning algorithm for information extraction from semi-structured documents, to the problem of named entity recogniti...
Today the major web search engines answer queries by showing ten result snippets, which users must inspect to identify relevant results. In this paper we investigat...
Although the literature contains reports of very high accuracy figures for the recognition of named entities in text, there are still some named entity phenomena that remain probl...
This paper describes experiments to establish the performance of a named entity recognition system which builds categorized lists of names from manually annotated training data. N...