The goal of Information Extraction tasks is to identify, categorize, classify, relate, and normalize specific information of interest found in free text, and to make that information available to a back-end database, data fusion, or other application. A data structure referred to as a template is typically used for capturing such information, particularly in cases where the amount and complexity of the information are substantial. The design of the template for such an application (or exercise) thus defines the task itself and therefore crucially affects the success of the Information Extraction attempt. This paper discusses template structure and the methodological issues that arise in the template design process, within the context of a discussion of the design process itself. It is based on the template design process for TIPSTER/MUC5 and certain subsequent Information Extraction exercises. The first section of this paper addresses the issue of selection of the appropriate data repr...
Boyan A. Onyshkevych