This paper describes a graph-based, relational, unsupervised learning algorithm that infers node replacement graph grammars, and its application to metabolic pathways. We search for frequent subgraphs and then check for overlap among the instances of each subgraph in the input graph. If instances overlap by one node, we propose a node replacement graph grammar production. We can also infer a hierarchy of productions by compressing the portions of a graph described by a production and then inferring new productions on the compressed graph. We show learning curves and how the learning process changes as the size of the sample set increases. We examine how computation time changes as the number of nodes in the input graphs increases. The graph grammars we inferred from metabolic pathways change little as more graphs are added to the input set, which indicates that the grammars found represent the input sets well.
Jacek P. Kukluk, Chang Hun You, Lawrence B. Holder
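
The overlap test summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the instances of a frequent subgraph have already been found and that each one is given as a set of node identifiers from the input graph. The function name `single_node_overlaps` and the sample data are hypothetical.

```python
from itertools import combinations


def single_node_overlaps(instances):
    """Return (i, j, shared_node) for each pair of instances sharing exactly one node.

    `instances` is a list of sets of node identifiers, one set per
    occurrence of the same frequent subgraph in the input graph.
    """
    overlaps = []
    for (i, a), (j, b) in combinations(enumerate(instances), 2):
        shared = a & b
        if len(shared) == 1:
            # A single shared node is the condition under which a
            # node replacement production would be proposed.
            overlaps.append((i, j, next(iter(shared))))
    return overlaps


# Hypothetical instances: the first three form a chain linked by
# single shared nodes (3 and 5); the fourth is isolated.
instances = [
    frozenset({1, 2, 3}),
    frozenset({3, 4, 5}),
    frozenset({5, 6, 7}),
    frozenset({8, 9, 10}),
]
overlaps = single_node_overlaps(instances)
propose_production = len(overlaps) > 0
```

In this sketch, any non-empty result marks the subgraph as a candidate for a recursive production. Finding the instances in the first place, and compressing the graph afterwards, are separate steps that are not shown.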