Efficient approximate entity extraction with edit distance constraints

Named entity recognition aims to extract named entities from unstructured text. A recent trend in named entity recognition is to find approximate matches in the text against a large dictionary of known entities, since the domain knowledge encoded in the dictionary helps improve extraction performance. In this paper, we study the problem of approximate dictionary matching with edit distance constraints. Compared to existing studies that use token-based similarity constraints, our problem definition lets us capture typographical and orthographical errors, both of which are common in entity extraction tasks yet may be missed by token-based similarity constraints. The problem is technically challenging because existing approaches based on q-gram filtering perform poorly when the dictionary contains many short entities. Our proposed solution is based on an improved neighborhood generation method employing novel partitioning and prefix pruning techniques. We ...
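To make the problem statement concrete, here is a minimal Python sketch of approximate dictionary matching under an edit distance threshold. It is not the method proposed in the paper: it is a brute-force baseline, and the names `edit_distance`, `approximate_matches`, `qgram_lower_bound`, and the threshold parameter `tau` are chosen here for illustration only. The last function uses the standard count-filtering bound for q-grams (two strings within edit distance `tau` share at least `max(|s|, |t|) - q + 1 - q * tau` q-grams) to show why such filters lose their pruning power on short entities.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with the classic two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def approximate_matches(document: str, dictionary: list[str], tau: int):
    """Brute-force baseline (an assumption for illustration, not the paper's
    algorithm): return (start, substring, entity) for every substring of the
    document whose edit distance to some dictionary entity is at most tau."""
    results = []
    for entity in dictionary:
        # A match can differ in length from the entity by at most tau.
        min_len = max(1, len(entity) - tau)
        max_len = len(entity) + tau
        for length in range(min_len, max_len + 1):
            for start in range(len(document) - length + 1):
                substring = document[start:start + length]
                if edit_distance(substring, entity) <= tau:
                    results.append((start, substring, entity))
    return results


def qgram_lower_bound(len_s: int, len_t: int, q: int, tau: int) -> int:
    """Minimum number of q-grams two strings must share if their edit
    distance is at most tau. A value <= 0 means the filter prunes nothing."""
    return max(len_s, len_t) - q + 1 - q * tau


if __name__ == "__main__":
    # A one-character typo is caught by the edit distance constraint.
    print(approximate_matches("see sigmd 2009", ["sigmod"], tau=1))
    # For a 4-character entity the q-gram bound is non-positive (no pruning),
    # while a 12-character entity still yields a usable bound.
    print(qgram_lower_bound(4, 4, q=3, tau=1))
    print(qgram_lower_bound(12, 12, q=3, tau=1))
```

The brute-force scan compares every candidate substring against every entity, which is what makes the problem expensive at dictionary scale. The abstract's neighborhood generation, partitioning, and prefix pruning techniques address that cost and are not reproduced in this sketch.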

Wei Wang 0011, Chuan Xiao, Xuemin Lin, Chengqi Zhang

Available Named Entity | Database | Entity Recognition | SIGMOD 2009 | Token-based Similarity Constraints |

» Example-driven design of efficient record matching queries

» An Effective Two-Stage Model for Exploiting Non-Local Dependencies in Named Entity Recogniti...

» Topology cuts: A novel min-cut/max-flow algorithm for topology preserving segmentation in N-D i...

» Stereo Reconstruction from Multiperspective Panoramas

Post Info

Added	05 Dec 2009
Updated	05 Dec 2009
Type	Conference
Year	2009
Where	SIGMOD
Authors	Wei Wang 0011, Chuan Xiao, Xuemin Lin, Chengqi Zhang

Sciweavers
