Sciweavers

KDD
1997
ACM

Mining Generalized Term Associations: Count Propagation Algorithm

We present here an approach and algorithm for mining generalized term associations. The problem is to find co-occurrence frequencies of terms, given a collection of documents each with relevant terms, and a taxonomy of terms. We have developed an efficient Count Propagation Algorithm (CPA) targeted for library applications such as Medline. The basis of our approach is that sets of terms (termsets) can be put into a taxonomy. By exploring this taxonomy, CPA propagates the count of termsets to their ancestors in the taxonomy, instead of separately counting each individual termset. We found that CPA is more efficient than other algorithms, particularly for counting large termsets. A benchmark on data sets extracted from a Medline database showed that CPA outperforms other known algorithms by up to around 200% (half the computing time) at the cost of less than 20% of additional memory to keep the taxonomy of termsets. We have used discovered knowledge of term associations for the purpose of improving search capability of Med...
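The idea of propagating counts up a taxonomy can be illustrated with a small sketch. This is not the authors' CPA, which organizes termsets themselves into a taxonomy; it is a simplified stand-in that counts each distinct document termset once and then pushes that count to every generalized term pair reachable through the term taxonomy. The function names, the `parent` mapping, and the restriction to pairs are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def ancestors_or_self(term, parent):
    """Return the term followed by all of its ancestors in the taxonomy."""
    chain = [term]
    while term in parent:
        term = parent[term]
        chain.append(term)
    return chain


def generalized_pair_counts(documents, parent):
    """Count co-occurrences of term pairs, including generalized terms.

    documents: iterable of term collections (one per document).
    parent: hypothetical child -> parent mapping describing the taxonomy.
    """
    # Count each distinct termset once, remembering how often it occurs.
    termset_counts = Counter(frozenset(doc) for doc in documents)

    pair_counts = Counter()
    for termset, count in termset_counts.items():
        # Generalize the termset by adding every ancestor of its terms.
        generalized = set()
        for term in termset:
            generalized.update(ancestors_or_self(term, parent))
        # Propagate the termset's count to each generalized pair, skipping
        # pairs in which one term is an ancestor of the other.
        for a, b in combinations(sorted(generalized), 2):
            if a in ancestors_or_self(b, parent) or b in ancestors_or_self(a, parent):
                continue
            pair_counts[(a, b)] += count
    return pair_counts


# Hypothetical toy taxonomy and documents, not taken from the paper.
toy_parent = {"aspirin": "analgesic", "ibuprofen": "analgesic", "migraine": "headache"}
toy_documents = [
    {"aspirin", "migraine"},
    {"ibuprofen", "migraine"},
    {"aspirin", "migraine"},
]
toy_counts = generalized_pair_counts(toy_documents, toy_parent)
```

In this toy data the generalized pair `("analgesic", "headache")` is supported by all three documents even though neither term appears literally in any of them, which is the kind of association a taxonomy-aware count is meant to surface.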
Added 08 Aug 2010
Updated 08 Aug 2010
Type Conference
Year 1997
Where KDD
Authors Jonghyun Kahng, Wen-Hsiang Kevin Liao, Dennis McLeod