Named entities (e.g., "Kofi Annan", "Coca-Cola", "Second World War") are ubiquitous in web pages and other types of document and often provide a simplified picture of the document's content. We present an ontology currently containing 31,000 named entities in different languages from various domains such as history, geography, politics, sports, arts, etc., which is being developed at the University of Munich (LMU). The underlying graph data model is simple and yet extremely versatile in different application scenarios. We demonstrate a prototype of a graphical interface to both the ontology and to documents on the web or in a local document repository, with a tight interaction in both directions. Occurrences of concepts from the ontology are highlighted and hyperlinked in the documents. Unrecognized entities could be added to the database and related to other concepts in a semiautomatic process. The entity database can also be used for extending full-...
Felix Weigel, Klaus U. Schulz, Levin Brunner, Edua