This paper proposes a new inference approach for Chinese probabilistic context-free grammars (PCFGs) that implements the EM algorithm on the basis of bracket-matching schemes. Using training texts annotated with constituent boundary information, together with an initial rule set that combines automatically constructed rules with treebank statistics, the algorithm generates a linguistically motivated, broad-coverage Chinese PCFG rule set. Current experimental results show that the algorithm learns efficiently and that the generated rule set is highly reliable.
S. J. Young, H.-H. Shih
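
The abstract does not spell out the bracket-matching schemes, so the following is only a minimal sketch of one common way constituent boundary annotation can constrain EM training: candidate spans that cross an annotated bracket are excluded before expected rule counts are collected. The crossing criterion, the half-open span convention, and the names `crosses` and `compatible_spans` are assumptions made for illustration, not the paper's definitions.

```python
from typing import Iterable, List, Tuple

Span = Tuple[int, int]  # half-open token span [start, end); an assumed convention


def crosses(span: Span, bracket: Span) -> bool:
    """Return True if the two spans overlap without one containing the other."""
    (i, j), (a, b) = span, bracket
    return (i < a < j < b) or (a < i < b < j)


def compatible_spans(n: int, brackets: Iterable[Span]) -> List[Span]:
    """List every span over n tokens that crosses none of the annotated brackets.

    In a bracket-constrained EM pass, only these spans would be allowed to
    contribute to the expected rule counts (hypothetical usage).
    """
    brackets = list(brackets)
    return [
        (i, j)
        for i in range(n)
        for j in range(i + 1, n + 1)
        if not any(crosses((i, j), b) for b in brackets)
    ]


if __name__ == "__main__":
    # A four-token sentence annotated with two constituents: [0, 2) and [2, 4).
    print(compatible_spans(4, [(0, 2), (2, 4)]))
```

Under this assumption, the annotation removes the spans (0, 3), (1, 3) and (1, 4) from the example sentence, which is how boundary information can narrow the set of analyses the EM algorithm has to consider.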