Sciweavers

NAACL
2001

Identifying Cognates by Phonetic and Semantic Similarity

I present a method of identifying cognates in the vocabularies of related languages. I show that a measure of phonetic similarity based on multivalued features performs better than "orthographic" measures, such as the Longest Common Subsequence Ratio (LCSR) or Dice's coefficient. I introduce a procedure for estimating semantic similarity of glosses that employs keyword selection and WordNet. Tests performed on vocabularies of four Algonquian languages indicate that the method is capable of discovering on average nearly 75% of cognates at 50% precision.
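For readers unfamiliar with the two "orthographic" baselines named in the abstract, the following is a minimal sketch of how they are commonly computed. It is not taken from the paper: the function names are hypothetical, LCSR is assumed to be the longest common subsequence length divided by the length of the longer word, and Dice's coefficient is assumed to be computed over sets of character bigrams.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b (dynamic programming)."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcsr(a: str, b: str) -> float:
    """Assumed definition: LCS length divided by the length of the longer word."""
    if not a or not b:
        return 0.0
    return lcs_length(a, b) / max(len(a), len(b))


def bigrams(word: str) -> set:
    """Set of adjacent character pairs in a word."""
    return {word[i:i + 2] for i in range(len(word) - 1)}


def dice(a: str, b: str) -> float:
    """Assumed definition: Dice's coefficient over character-bigram sets."""
    x, y = bigrams(a), bigrams(b)
    if not x and not y:
        return 0.0
    return 2 * len(x & y) / (len(x) + len(y))


if __name__ == "__main__":
    # Illustrative word pairs only; these are not data from the paper.
    print(lcsr("colour", "color"))
    print(dice("night", "nacht"))
```

Both measures compare spellings character by character and use no phonetic information, which is the contrast the abstract draws with the feature-based phonetic measure.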
Grzegorz Kondrak
Added 31 Oct 2010
Updated 31 Oct 2010
Type Conference
Year 2001
Where NAACL
Authors Grzegorz Kondrak