This paper describes the GE-CMU TIPSTER/SHOGUN system as configured for the TIPSTER 24-month (MUC-5) benchmark, and gives details of the system's performance on the selected Japanese and English texts. The SHOGUN system is a distillation of some of the key ideas that emerged from previous benchmarks and experiments, emphasizing a simple architecture in which the focus is on detailed corpus-based knowledge. This design allowed the project to meet its goal of achieving advances in coverage and accuracy while showing consistently good performance across languages and domains.
Paul S. Jacobs, George B. Krupka, Lisa F. Rau, Mic