Inducing Gazetteers for Named Entity Recognition by Large-Scale Clustering of Dependency Relations

We propose using large-scale clustering of dependency relations between verbs and multiword nouns (MNs) to construct a gazetteer for named entity recognition (NER). Since dependency relations capture the semantics of MNs well, the MN clusters constructed from them should serve as a good gazetteer. However, the high computational cost of clustering has so far prevented its use for constructing gazetteers. We parallelized a clustering algorithm based on expectation-maximization (EM) and thus enabled the construction of large-scale MN clusters. We demonstrate on the IREX dataset for Japanese NER that using the constructed clusters as a gazetteer (a cluster gazetteer) is an effective way of improving the accuracy of NER. Moreover, we demonstrate that combining the cluster gazetteer with a gazetteer extracted from Wikipedia, which is also useful for NER, can further improve the accuracy in several cases.
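
To make the idea concrete, here is a minimal single-process sketch of EM clustering over (noun, verb) dependency counts. The abstract does not spell out the model, so this assumes a generic latent-class formulation, p(n, v) = Σ_c p(c) p(n|c) p(v|c), and assigns each noun to its most probable cluster; the function name `induce_cluster_gazetteer`, its parameters, and the toy words are all hypothetical, and the parallelization described above is not attempted.

```python
import random
from collections import defaultdict


def induce_cluster_gazetteer(pair_counts, n_clusters, n_iters=100, seed=0):
    """Cluster nouns by EM on (noun, verb) -> count; return noun -> cluster id.

    Assumed model (not taken from the abstract):
        p(n, v) = sum_c p(c) * p(n|c) * p(v|c)
    """
    rng = random.Random(seed)
    nouns = sorted({n for n, _ in pair_counts})
    verbs = sorted({v for _, v in pair_counts})

    def random_dist(keys):
        # Random positive weights, normalized, to break cluster symmetry.
        weights = {k: rng.random() + 0.1 for k in keys}
        total = sum(weights.values())
        return {k: w / total for k, w in weights.items()}

    p_c = [1.0 / n_clusters] * n_clusters
    p_n = [random_dist(nouns) for _ in range(n_clusters)]
    p_v = [random_dist(verbs) for _ in range(n_clusters)]

    for _ in range(n_iters):
        c_acc = [0.0] * n_clusters
        n_acc = [defaultdict(float) for _ in range(n_clusters)]
        v_acc = [defaultdict(float) for _ in range(n_clusters)]

        # E-step: distribute each pair's count over clusters by posterior p(c|n,v).
        for (n, v), count in pair_counts.items():
            joint = [p_c[c] * p_n[c][n] * p_v[c][v] for c in range(n_clusters)]
            z = sum(joint)
            if z == 0.0:
                continue  # numerically dead pair; skip rather than divide by zero
            for c in range(n_clusters):
                r = count * joint[c] / z
                c_acc[c] += r
                n_acc[c][n] += r
                v_acc[c][v] += r

        # M-step: re-estimate parameters from the expected counts.
        total = sum(c_acc)
        if total == 0.0:
            break
        for c in range(n_clusters):
            p_c[c] = c_acc[c] / total
            if c_acc[c] > 0.0:
                p_n[c] = {n: n_acc[c][n] / c_acc[c] for n in nouns}
                p_v[c] = {v: v_acc[c][v] / c_acc[c] for v in verbs}

    # Gazetteer entry: the cluster maximizing p(c|n), proportional to p(c) * p(n|c).
    return {
        n: max(range(n_clusters), key=lambda c: p_c[c] * p_n[c][n])
        for n in nouns
    }


if __name__ == "__main__":
    # Hypothetical toy counts: place-like nouns vs. company-like nouns.
    toy_counts = {
        ("tokyo", "visit"): 5, ("tokyo", "live_in"): 5,
        ("osaka", "visit"): 5, ("osaka", "live_in"): 5,
        ("toyota", "acquire"): 5, ("toyota", "sue"): 5,
        ("sony", "acquire"): 5, ("sony", "sue"): 5,
    }
    print(induce_cluster_gazetteer(toy_counts, n_clusters=2))
```

In an NER system, the cluster ID returned for a noun could then be looked up like any other gazetteer label and supplied to the tagger as a feature. The cluster IDs themselves are arbitrary; only the grouping is meaningful.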

Jun'ichi Kazama, Kentaro Torisawa

ACL 2008 | Cluster Gazetteer | Computational Linguistics | Dependency Relations | MN Clusters

Added	29 Oct 2010
Updated	29 Oct 2010
Type	Conference
Year	2008
Where	ACL
Authors	Jun'ichi Kazama, Kentaro Torisawa
