Corpus-based Error Detection in a Multilingual Medical Thesaurus



Cross-language document retrieval systems require a multilingual thesaurus of some kind to index documents in different languages semantically. The peculiarities of the medical sublanguage, together with the subjectivity of lexicographers’ choices, complicate the thesaurus construction process, which furthermore requires a high degree of communication and interaction between the lexicographers involved. A systematic procedure for detecting errors is therefore necessary. We describe a method that supports the maintenance of the multilingual medical subword repository of the MorphoSaurus system, which assigns language-independent semantic identifiers to medical texts. Based on the assumption that the distribution of these semantic identifiers should be similar when closely related texts in different languages are compared, our approach identifies those semantic identifiers whose distribution varies most between the two languages of a pair. The revision of these identifiers...
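
The core idea of the abstract, comparing how often each semantic identifier occurs in related texts of two languages and flagging the identifiers that differ most, can be illustrated with a minimal sketch. The abstract does not state which divergence measure the authors use, so the absolute difference of relative frequencies below is an assumption, as are the function names and the toy identifiers; none of this is taken from the MorphoSaurus system itself.

```python
from collections import Counter


def relative_frequencies(identifiers):
    """Map each identifier to its share of all identifier occurrences."""
    counts = Counter(identifiers)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {ident: n / total for ident, n in counts.items()}


def rank_divergent_identifiers(ids_lang_a, ids_lang_b, top_n=10):
    """Rank identifiers by how much their relative frequency differs.

    Assumption: divergence is the absolute difference of relative
    frequencies; the paper may use a different measure.
    """
    freq_a = relative_frequencies(ids_lang_a)
    freq_b = relative_frequencies(ids_lang_b)
    all_ids = set(freq_a) | set(freq_b)
    scored = [
        (ident, abs(freq_a.get(ident, 0.0) - freq_b.get(ident, 0.0)))
        for ident in all_ids
    ]
    # Largest difference first; ties broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_n]


# Hypothetical identifier streams produced by indexing two related corpora.
english = ["#heart", "#inflamm", "#heart", "#kidney"]
portuguese = ["#heart", "#inflamm", "#liver", "#liver"]

for ident, score in rank_divergent_identifiers(english, portuguese, top_n=3):
    print(ident, round(score, 3))
```

Identifiers at the top of such a ranking would be the first candidates for manual review by lexicographers, since a large gap may point to a missing or wrongly assigned thesaurus entry in one of the languages.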



Healthcare | Language-independent Semantic Identifiers | MEDINFO 2007 | Semantic Identifiers | Thesaurus


Post Info

Added	30 Oct 2010
Updated	30 Oct 2010
Type	Conference
Year	2007
Where	MEDINFO
Authors	Roosewelt L. Andrade, Edson José Pacheco, Píndaro S. Cancian, Percy Nohama, Stefan Schulz
