Sciweavers

LREC
2010

Building an Italian FrameNet through Semi-automatic Corpus Analysis

14 years 1 months ago
Building an Italian FrameNet through Semi-automatic Corpus Analysis
In this paper, we outline the methodology we adopted to develop a FrameNet for Italian. The main element of novelty with respect to the original FrameNet is represented by the fact that the creation and annotation of Lexical Units is strictly grounded in distributional information (statistical distribution of verbal subcategorization frames, lexical and semantic preferences of each frame) automatically acquired from a large, dependency-parsed corpus. We claim that this approach allows us to overcome some of the shortcomings of the classical lexicographic method used to create FrameNet, by complementing the accuracy of manual annotation with the robustness of data on the global distributional patterns of a verb. In the paper, we describe our method for extracting distributional data from the corpus and the way we used it for the encoding and annotation of LUs. The long-term goal of our project is to create an electronic lexicon for Italian similar to the original English FrameNet. For ...
Alessandro Lenci, Martina Johnson, Gabriella Lapes
Added 29 Oct 2010
Updated 29 Oct 2010
Type Conference
Year 2010
Where LREC
Authors Alessandro Lenci, Martina Johnson, Gabriella Lapesa
Comments (0)