Automatic Semantic Sequence Extraction from Unrestricted Non-Tagged Texts

Morphological processing, syntactic parsing and other useful tools have been proposed in the field of natural language processing (NLP). Many of those NLP tools take dictionary-based approaches. Thus these tools are often not very effective on texts written in casual wording or texts which contain many domain-specific terms, because of the lack of vocabulary. In this paper we propose a simple method to obtain domain-specific sequences from unrestricted texts using statistical information only. This method is language-independent. We conducted experiments on sequence extraction from email texts in Japanese, and succeeded in extracting significant semantic sequences in the test corpus. We ran morphological parsing on the test corpus with ChaSen, a Japanese dictionary-based morphological parser, and examined our system's efficiency in extracting semantic sequences which were not recognized by ChaSen. Our system detected 69.06% of the unknown words correctly.
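The abstract does not say which statistic the method relies on, so the following is only a minimal sketch of the general idea of dictionary-free, statistics-only sequence extraction, not the authors' algorithm. It assumes a simple rule: adjacent characters are joined into one sequence when their bigram occurs at least `min_count` times in the corpus. The function name `extract_sequences` and the parameters `min_count` and `min_len` are hypothetical and chosen for illustration.

```python
from collections import Counter


def extract_sequences(texts, min_count=2, min_len=2):
    """Illustrative only: join adjacent characters whose bigram is frequent.

    Assumption (not from the paper): a character pair is treated as a
    strong link when it occurs at least `min_count` times in the corpus.
    """
    # Count character bigrams over the whole corpus (no dictionary, no tags).
    bigrams = Counter()
    for text in texts:
        for a, b in zip(text, text[1:]):
            bigrams[a + b] += 1

    sequences = Counter()
    for text in texts:
        if not text:
            continue
        current = text[0]
        for a, b in zip(text, text[1:]):
            if bigrams[a + b] >= min_count:
                current += b  # frequent link: extend the current sequence
            else:
                if len(current) >= min_len:
                    sequences[current] += 1
                current = b  # weak link: start a new sequence
        if len(current) >= min_len:
            sequences[current] += 1
    return sequences


if __name__ == "__main__":
    # Toy corpus: "abc" recurs in different contexts, so it is kept as a unit.
    print(extract_sequences(["abcx", "abcy", "zabc"]))
```

Because the sketch works on raw characters, it needs no word boundaries or lexicon, which is the property the abstract emphasises for unsegmented Japanese text; a real system would likely use a more robust association measure than a raw count threshold.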

Shiho Nobesawa, Hiroaki Saito, Masakazu Nakanishi

COLING 2000 | COLING 2008 | Many Domain-specific Terms | Semantic Sequences | Test Corpus

Post Info

Added	01 Nov 2010
Updated	01 Nov 2010
Type	Conference
Year	2000
Where	COLING
Authors	Shiho Nobesawa, Hiroaki Saito, Masakazu Nakanishi
