mwetoolkit: a Framework for Multiword Expression Identification


This paper presents the Multiword Expression Toolkit (mwetoolkit), an environment for type- and language-independent MWE identification from corpora. The mwetoolkit produces a targeted list of MWE candidates, extracted and filtered according to user-defined criteria and a set of standard statistical association measures. To generate corpus counts, the toolkit provides both a corpus indexing facility and a tool for integration with web search engines; for evaluation, it provides validation and annotation facilities. The mwetoolkit also integrates easily with a machine learning tool for creating and applying supervised MWE extraction models when annotated data is available. In our experiments, the mwetoolkit was tested and evaluated on MWE extraction in the biomedical domain. Our preliminary results show that the toolkit performs better than other approaches, especially in terms of recall. Moreover, this first version can be extended in ...
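
As a rough illustration of what "filtering candidates by a statistical association measure" can mean, the sketch below scores adjacent word pairs by pointwise mutual information (PMI), one commonly used association measure. It is not the mwetoolkit's own code or interface: the function name `pmi_candidates`, the `min_count` threshold, and the toy sentence are all assumptions made for this example, and the abstract does not say which measures the toolkit implements.

```python
import math
from collections import Counter


def pmi_candidates(tokens, min_count=2):
    """Rank adjacent word pairs by pointwise mutual information (PMI).

    Hypothetical helper for illustration only; not part of mwetoolkit.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)
    n_bi = max(len(tokens) - 1, 1)

    scored = []
    for (w1, w2), count in bigrams.items():
        # Frequency filter: rare pairs give unreliable PMI estimates.
        if count < min_count:
            continue
        p_xy = count / n_bi
        p_x = unigrams[w1] / n_uni
        p_y = unigrams[w2] / n_uni
        scored.append(((w1, w2), math.log2(p_xy / (p_x * p_y))))

    # Highest association first; ties keep first-seen order.
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored


# Toy corpus (assumed for the example, not from the paper).
tokens = "gene expression is high and gene expression is low".split()
result = pmi_candidates(tokens, min_count=2)
for pair, score in result:
    print(pair, round(score, 3))
```

A real extraction pipeline would typically work on lemmas or part-of-speech patterns rather than raw tokens, and would combine several measures instead of relying on PMI alone.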

Carlos Ramisch, Aline Villavicencio, Christian Boitet


Education | Language-independent MWE Identification | LREC 2010 | MWE Extraction | MWE Extraction Models


Post Info

Added	29 Oct 2010
Updated	29 Oct 2010
Type	Conference
Year	2010
Where	LREC
Authors	Carlos Ramisch, Aline Villavicencio, Christian Boitet
