ISDA
2008
IEEE

Compute the Term Contributed Frequency

In this paper, we propose an algorithm and data structure for computing the term contributed frequency (tcf) of all N-grams in a text corpus. Although term frequency is one of the standard notions of frequency in corpus-based natural language processing (NLP), applying it to N-gram approaches raises problems such as the distortion of phrase frequencies. We attempt to overcome this drawback by building a DAG containing the proposed data structure and using it to retrieve more reliable term frequencies. Our proposed algorithm and data structure are more efficient than traditional term frequency extraction approaches and are portable to various languages.
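The distortion mentioned in the abstract can be illustrated with a small Python sketch. It is not the paper's algorithm: the abstract does not define tcf precisely, and the DAG-based data structure is not reproduced here. The sketch assumes, purely for illustration, that an occurrence of an N-gram "contributes" only when it is not contained in an occurrence of a longer repeated N-gram. The names `ngram_counts`, `contributed_counts`, and `min_freq` are hypothetical, and the brute-force scan is chosen for readability, not efficiency.

```python
from collections import Counter


def ngram_counts(tokens, max_n):
    """Plain term frequency for every N-gram with 1 <= N <= max_n."""
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


def contributed_counts(tokens, max_n, min_freq=2):
    """Illustrative 'contributed' count (an assumption, not the paper's tcf).

    An occurrence of a repeated N-gram is counted only if no strictly
    longer repeated N-gram occurrence contains it.
    """
    tf = ngram_counts(tokens, max_n)
    # Collect (start, end) spans of every occurrence of a repeated N-gram.
    spans = [
        (i, i + n)
        for n in range(1, max_n + 1)
        for i in range(len(tokens) - n + 1)
        if tf[tuple(tokens[i:i + n])] >= min_freq
    ]
    contributed = Counter()
    for start, end in spans:
        covered = any(
            s <= start and end <= e and (e - s) > (end - start)
            for s, e in spans
        )
        if not covered:
            contributed[tuple(tokens[start:end])] += 1
    return tf, contributed


tokens = "new york city a new york city b new york c".split()
tf, contributed = contributed_counts(tokens, max_n=3)

# Plain frequency counts "new york" inside every "new york city" as well.
print(tf[("new", "york")])                   # 3
# Under the assumed rule, only the standalone occurrence remains.
print(contributed[("new", "york")])          # 1
print(contributed[("new", "york", "city")])  # 2
```

In this toy corpus the bigram "new york" has a plain frequency of 3, but two of those occurrences belong to the repeated phrase "new york city". Discounting them is one way to read "more reliable term frequencies"; the actual definition and the efficient construction are given in the paper itself.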
Cheng-Lung Sung, Hsu-Chun Yen, Wen-Lian Hsu
Added 31 May 2010
Updated 31 May 2010
Type Conference
Year 2008
Where ISDA