This paper describes experiments to establish the performance of a named entity recognition system which builds categorized lists of names from manually annotated training data. Names in text are then identified using only these lists. This approach does not perform as well as state-of-the-art named entity recognition systems. However, we then show that by using simple filtering techniques for improving the automatically acquired lists, substantial performance benefits can be achieved, with resulting F-measure scores of 87 on a standard test set. These results provide a baseline against which the contribution of more sophisticated supervised learning techniques for NE recognition should be measured.
Mark Stevenson, Robert J. Gaizauskas
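
The list-based approach summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the data layout (token lists with `(start, end, category)` spans), the function names `build_name_lists`, `filter_lists` and `tag`, the greedy longest-match lookup, and in particular the common-word filter are all assumptions made for illustration. The abstract does not specify which filtering techniques were used.

```python
from collections import defaultdict


def build_name_lists(annotated):
    """Collect one set of names per category from annotated training data.

    `annotated` is an iterable of (tokens, spans) pairs, where each span is
    (start, end, category) with `end` exclusive. This layout is an assumption.
    """
    lists = defaultdict(set)
    for tokens, spans in annotated:
        for start, end, category in spans:
            lists[category].add(tuple(tokens[start:end]))
    return lists


def filter_lists(lists, common_words):
    """Hypothetical filter: drop single-token names that are also common words.

    This stands in for the unspecified "simple filtering techniques".
    """
    return {
        category: {
            name for name in names
            if not (len(name) == 1 and name[0].lower() in common_words)
        }
        for category, names in lists.items()
    }


def tag(tokens, lists):
    """Mark names by greedy longest-match lookup against the lists only.

    Returns (start, end, category) spans. If a name appears in several
    categories, the alphabetically last category is used (an arbitrary choice).
    """
    lookup = {}
    for category in sorted(lists):
        for name in lists[category]:
            lookup[name] = category
    max_len = max((len(name) for name in lookup), default=0)

    spans, i = [], 0
    while i < len(tokens):
        # Try the longest candidate first, then progressively shorter ones.
        for length in range(min(max_len, len(tokens) - i), 0, -1):
            candidate = tuple(tokens[i:i + length])
            if candidate in lookup:
                spans.append((i, i + length, lookup[candidate]))
                i += length
                break
        else:
            i += 1  # no list entry starts at this token
    return spans
```

In this sketch the tagger consults nothing but the acquired lists, so an ambiguous entry such as a personal name that is also an ordinary word is marked wherever it appears with matching capitalization. Filtering the lists before lookup is one plausible way to remove such entries.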