Segmenting Sentences into Linky Strings Using D-bigram Statistics


Segmentation plays an important role in natural language processing (NLP), especially for languages whose sentences are not easily separated into morphemes. In this study we propose a method of segmenting a sentence. The system described in this paper does not use any grammatical information or knowledge in processing. Instead, it uses statistical information drawn from a non-tagged corpus of the target language. Most segmenting systems aim to pick out conventional morphemes, which are defined for human use. However, we still do not know whether those conventional morphemes are good units for computational processing. In this paper we explain our system's algorithm and its experimental results on Japanese, though the system is not designed for a particular language.
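The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea, under stated assumptions: a "d-bigram" is taken to be a pair of characters together with the distance d between them, the link strength across a gap is taken to be the sum of pointwise mutual information over all pairs spanning that gap, and each pair is weighted by 1/d². The names `train`, `link_score`, and `segment`, the weighting, and the threshold are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def train(sentences, max_d=3):
    """Collect character counts and (left, right, distance) pair counts
    from raw, untagged sentences. max_d is an assumed window size."""
    uni, pair, pair_total = Counter(), Counter(), Counter()
    for s in sentences:
        uni.update(s)
        for i in range(len(s)):
            for d in range(1, max_d + 1):
                if i + d < len(s):
                    pair[(s[i], s[i + d], d)] += 1
                    pair_total[d] += 1
    return uni, pair, pair_total, max_d


def link_score(s, k, model):
    """Link strength of the gap between s[k] and s[k + 1].

    Assumption: sum of PMI over every seen pair (i, j) with
    i <= k < j and j - i <= max_d, each weighted by 1 / d**2.
    Unseen pairs contribute nothing."""
    uni, pair, pair_total, max_d = model
    n = sum(uni.values())
    score = 0.0
    for i in range(max(0, k - max_d + 1), k + 1):
        for j in range(k + 1, min(len(s), i + max_d + 1)):
            d = j - i
            c = pair[(s[i], s[j], d)]
            if c:
                p_ab = c / pair_total[d]
                p_a = uni[s[i]] / n
                p_b = uni[s[j]] / n
                score += math.log2(p_ab / (p_a * p_b)) / (d * d)
    return score


def segment(s, model, threshold=0.0):
    """Cut s at every gap whose link score does not exceed the threshold.
    The pieces that remain are strings of strongly linked characters."""
    if not s:
        return []
    pieces, start = [], 0
    for k in range(len(s) - 1):
        if link_score(s, k, model) <= threshold:
            pieces.append(s[start:k + 1])
            start = k + 1
    pieces.append(s[start:])
    return pieces
```

Under these assumptions, `segment(text, train(corpus))` needs nothing beyond the raw corpus: no dictionary, tags, or grammar. Raising `threshold` yields shorter pieces, and lowering it yields longer ones.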

Shiho Nobesawa, Junya Tsutsumi, Sun Da Jiang, Tomohisa Sano, Kengo Sato, Masakazu Nakanishi


COLING 1996 | COLING 2008 | Conventional Morphemes | Knowledge In Processing | Segmenting |


Post Info

Added	02 Nov 2010
Updated	02 Nov 2010
Type	Conference
Year	1996
Where	COLING
Authors	Shiho Nobesawa, Junya Tsutsumi, Sun Da Jiang, Tomohisa Sano, Kengo Sato, Masakazu Nakanishi
