
PAKDD
2004
ACM

CMTreeMiner: Mining Both Closed and Maximal Frequent Subtrees

Abstract. Tree structures are used extensively in domains such as computational biology, pattern recognition, XML databases, and computer networks. One important problem in mining databases of trees is to find frequently occurring subtrees. However, because of the combinatorial explosion, the number of frequent subtrees usually grows exponentially with the size of the subtrees. In this paper, we present CMTreeMiner, a computationally efficient algorithm that discovers all closed and maximal frequent subtrees in a database of rooted unordered trees. The algorithm mines both closed and maximal frequent subtrees by traversing an enumeration tree that systematically enumerates all subtrees, while using an enumeration DAG to prune the branches of the enumeration tree that do not correspond to closed or maximal frequent subtrees. Both the enumeration tree and the enumeration DAG are defined in terms of a canonical form for rooted unordered trees: the depth-first canonical form (DFCF). We ...
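To illustrate why a canonical form matters for rooted unordered trees, here is a minimal sketch in Python. It is not the paper's DFCF, whose definition is cut off in the abstract above. It is a simplified, sort-based string encoding, and the names `Node` and `canonical_string` are hypothetical. It assumes node labels contain no parentheses. The point is only that two trees differing in sibling order map to the same representation, so a miner can count them as one pattern.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """A node of a rooted, labeled tree (hypothetical helper type)."""
    label: str  # assumed to contain no '(' or ')' characters
    children: List["Node"] = field(default_factory=list)


def canonical_string(node: Node) -> str:
    """Encode a rooted unordered tree as a string that ignores sibling order.

    Each subtree is encoded recursively, and the encodings of the children
    are sorted before concatenation, so any reordering of siblings yields
    the same result. This is an illustrative encoding, not the DFCF.
    """
    child_strings = sorted(canonical_string(child) for child in node.children)
    return "(" + node.label + "".join(child_strings) + ")"


# Two trees that differ only in the order of the root's children.
t1 = Node("A", [Node("B"), Node("C", [Node("D")])])
t2 = Node("A", [Node("C", [Node("D")]), Node("B")])

# A structurally different tree: D hangs under B instead of C.
t3 = Node("A", [Node("B", [Node("D")]), Node("C")])

print(canonical_string(t1) == canonical_string(t2))  # same unordered tree
print(canonical_string(t1) == canonical_string(t3))  # different trees
```

With such an encoding, candidate subtrees can be used as dictionary keys when counting support, which avoids counting the same unordered pattern more than once.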
Yun Chi, Yirong Yang, Yi Xia, Richard R. Muntz
Added 02 Jul 2010
Updated 02 Jul 2010
Type Conference
Year 2004
Where PAKDD