Sciweavers

CEC
2010
IEEE

Evolving natural language grammars without supervision

Unsupervised grammar induction is one of the most difficult tasks in natural language processing. Its goal is to extract a grammar representing the structure of a language from texts that carry no annotation of that structure. We have devised an evolutionary algorithm which, for each sentence, evolves a population of trees representing different parse trees of that sentence. Each of these trees represents a part of a grammar. The evaluation function takes into account the contexts in which each sequence of Part-Of-Speech tags (POSseq) appears in the training corpus, as well as the frequencies of those POSseqs and contexts. The grammar for the whole training corpus is constructed incrementally. The algorithm has been evaluated on a well-known annotated English corpus, though the annotations have been used only for evaluation purposes. Results indicate that the proposed algorithm is able to improve the results of a classical optimization algorithm, such as EM (Expectation Maximization), for s...
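The abstract does not give the evaluation function itself, so the following is only a minimal sketch of the corpus statistics it mentions: how often each POSseq occurs and in which contexts. The function name `collect_posseq_contexts`, the `max_len` cap, the `#` boundary marker, and the choice of a context as the pair of tags immediately left and right of the sequence are all assumptions made for illustration, not the authors' definitions.

```python
from collections import Counter, defaultdict

BOUNDARY = "#"  # hypothetical marker for the start and end of a sentence


def collect_posseq_contexts(tagged_sentences, max_len=3):
    """Count POS-tag sequences and the (left, right) contexts they occur in.

    tagged_sentences: iterable of sentences, each a list of POS tags.
    max_len: longest tag sequence to count (an assumed cap).
    Returns (seq_freq, ctx_freq), where seq_freq[seq] is the number of
    occurrences of seq and ctx_freq[seq][(left, right)] is the number of
    times seq appeared between the tags left and right.
    """
    seq_freq = Counter()
    ctx_freq = defaultdict(Counter)
    for tags in tagged_sentences:
        # Pad so that every sequence has a left and a right neighbour.
        padded = [BOUNDARY] + list(tags) + [BOUNDARY]
        n = len(tags)
        for length in range(1, min(max_len, n) + 1):
            for start in range(1, n - length + 2):
                seq = tuple(padded[start:start + length])
                context = (padded[start - 1], padded[start + length])
                seq_freq[seq] += 1
                ctx_freq[seq][context] += 1
    return seq_freq, ctx_freq
```

A scoring function in the spirit of the abstract could then combine `seq_freq` and `ctx_freq` to judge whether a POSseq behaves like a constituent; how the paper actually combines them is not stated in the text above.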
Lourdes Araujo, Jesus Santamaria
Added 08 Nov 2010
Updated 08 Nov 2010
Type Conference
Year 2010
Where CEC
Authors Lourdes Araujo, Jesus Santamaria