
COLING
2008

Extending a Thesaurus with Words from Pan-Chinese Sources

In this paper, we extend a Chinese thesaurus with words used distinctly in various Chinese communities. The acquisition and classification of such region-specific lexical items is an important step toward the larger goal of constructing a Pan-Chinese lexical resource. In particular, we extend a previous study in three respects: (1) we improve automatic classification by removing duplicated words from the thesaurus, (2) we experiment with classifying words at the subclass level and the semantic head level, and (3) we further investigate how data heterogeneity between the region-specific words and the words in the thesaurus may affect classification performance. Automatic classification was based on the similarity between a target word and individual categories of words in the thesaurus, measured by the cosine function. Experiments were conducted on 120 target words from four regions. The automatic classification results were evaluated against a gold standard obtained from hum...
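The cosine-based classification step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how words are represented, so everything below is an assumption made for illustration: words are sparse feature-count dictionaries, a category vector is the sum of its members' vectors, and the names `cosine`, `category_vector`, `rank_categories` and the toy data are hypothetical, not the authors' implementation.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v[key] for key, weight in u.items() if key in v)
    norm_u = math.sqrt(sum(weight * weight for weight in u.values()))
    norm_v = math.sqrt(sum(weight * weight for weight in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_u * norm_v)


def category_vector(member_vectors):
    """Assumption: a category is represented by the sum of its members' vectors."""
    total = Counter()
    for vec in member_vectors:
        total.update(vec)  # Counter.update adds counts key by key
    return total


def rank_categories(target_vec, categories):
    """Return (category name, similarity) pairs, most similar first."""
    scored = [
        (name, cosine(target_vec, category_vector(members)))
        for name, members in categories.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy data: two thesaurus categories and one target word.
toy_categories = {
    "vehicle": [{"drive": 2, "road": 1}, {"road": 2, "wheel": 1}],
    "food": [{"eat": 3, "cook": 1}],
}
toy_target = {"drive": 1, "road": 1}
toy_ranking = rank_categories(toy_target, toy_categories)
```

In this sketch the target word would be assigned to the top-ranked category; evaluating against a gold standard would then amount to checking whether the correct category appears at or near the top of the ranking.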
Oi Yee Kwong, Benjamin Ka-Yin T'sou
Added: 29 Oct 2010
Updated: 29 Oct 2010
Type: Conference
Year: 2008
Where: COLING
Authors: Oi Yee Kwong, Benjamin Ka-Yin T'sou