ERCIMDL
2010
Springer

German Encyclopedia Alignment Based on Information Retrieval Techniques

Collaboratively created online encyclopedias have become increasingly popular and, particularly in terms of completeness, have begun to surpass their printed counterparts. Two German publishers of traditional encyclopedias have responded to this challenge by deciding to merge their corpora into a single, more complete encyclopedia. The crucial step in this merge is the alignment of articles. We have developed a system that identifies corresponding entries in different encyclopedic corpora. At its core is an alignment algorithm that incorporates various techniques from the field of information retrieval. We evaluated the system on four real-world encyclopedias against a ground truth provided by domain experts. A combination of weighting and ranking techniques was found to deliver satisfactory performance.
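
The abstract does not say which weighting and ranking techniques the system combines, so the following is only an illustrative sketch of the general idea, not the authors' algorithm. It assumes TF-IDF term weighting and cosine-similarity ranking, two standard information retrieval techniques; the function names (`tfidf_vectors`, `align`) and the toy entries are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and split on non-word characters (assumed preprocessing).
    return re.findall(r"\w+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed inverse document frequency (one common variant).
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    return [
        {t: tf * idf[t] for t, tf in Counter(tokens).items()}
        for tokens in tokenized
    ]


def cosine(a, b):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def align(source, target):
    """Rank all target entries for each source entry.

    source, target: dicts mapping entry id -> article text.
    Returns {source_id: [(target_id, score), ...]}, best match first.
    """
    src_ids, tgt_ids = list(source), list(target)
    # Weight terms over both corpora together so the weights are comparable.
    vectors = tfidf_vectors(
        [source[i] for i in src_ids] + [target[i] for i in tgt_ids]
    )
    src_vecs = vectors[: len(src_ids)]
    tgt_vecs = vectors[len(src_ids):]
    ranked = {}
    for sid, sv in zip(src_ids, src_vecs):
        scores = [(tid, cosine(sv, tv)) for tid, tv in zip(tgt_ids, tgt_vecs)]
        ranked[sid] = sorted(scores, key=lambda pair: pair[1], reverse=True)
    return ranked


# Toy entries standing in for two encyclopedias (made up for illustration).
encyclopedia_a = {
    "a1": "Vienna is the capital city of Austria",
    "a2": "The Danube is a river in Europe",
}
encyclopedia_b = {
    "b1": "Danube river flowing through central Europe",
    "b2": "Capital of Austria: the city of Vienna",
}

for entry_id, candidates in align(encyclopedia_a, encyclopedia_b).items():
    best_id, best_score = candidates[0]
    print(entry_id, "->", best_id, round(best_score, 3))
```

In this sketch the top-ranked candidate is taken as the aligned entry. A real system would presumably also need a score threshold or a similar rule for entries that have no counterpart in the other corpus.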
Roman Kern, Michael Granitzer
Added 09 Nov 2010
Updated 09 Nov 2010
Type Conference
Year 2010
Where ERCIMDL
Authors Roman Kern, Michael Granitzer