Ever since the landmark paper of Ramshaw and Marcus (1995), machine learning systems have been used successfully to identify base phrases (chunks), the lowest-level constituents of a parse tree. We extend a state-of-the-art chunking algorithm to a bottom-up parser by recursively applying the chunker to its own output. After testing different training configurations, we obtain a reasonable parser, which we evaluate on a standard data set. Its performance falls behind that of current state-of-the-art parsers. We suggest modifications to the parser that may lead to future performance improvements.
Erik F. Tjong Kim Sang
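
The idea of recursively applying a chunker to its own output can be illustrated with a minimal sketch. It is not the system described here: the learned chunker is replaced by a hypothetical hand-written rule table (`RULES`), and the names `chunk_once` and `parse` are assumptions made for illustration only. Each pass groups adjacent nodes into phrases, and the resulting phrase labels serve as input to the next pass until no new chunk is found.

```python
from typing import List, Tuple, Union

# A node is a (tag, word) leaf or a (label, [children]) phrase.
Node = Tuple[str, Union[str, list]]

# Hypothetical toy rules standing in for a trained chunker:
# a sequence of adjacent labels is grouped into one phrase.
RULES = [
    (("DT", "NN"), "NP"),
    (("IN", "NP"), "PP"),
    (("VBD", "NP"), "VP"),
    (("NP", "VP"), "S"),
]


def chunk_once(nodes: List[Node]) -> List[Node]:
    """One left-to-right pass: group nodes wherever a rule matches."""
    out: List[Node] = []
    i = 0
    while i < len(nodes):
        for pattern, phrase in RULES:
            window = nodes[i:i + len(pattern)]
            if tuple(node[0] for node in window) == pattern:
                out.append((phrase, window))  # build a new phrase node
                i += len(pattern)
                break
        else:
            out.append(nodes[i])  # no rule matched here; keep the node
            i += 1
    return out


def parse(tagged: List[Node]) -> List[Node]:
    """Apply the chunker to its own output until nothing changes."""
    nodes = list(tagged)
    while True:
        chunked = chunk_once(nodes)
        if len(chunked) == len(nodes):  # no chunk was built in this pass
            return chunked
        nodes = chunked


sentence = [("DT", "the"), ("NN", "cat"), ("VBD", "saw"),
            ("DT", "a"), ("NN", "dog")]
tree = parse(sentence)
print(tree)
```

In this toy run the first pass finds the two noun phrases, the second builds the verb phrase over the verb and the object noun phrase, and the third joins subject and verb phrase into a sentence node. A learned chunker would take the place of `RULES` at every level.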