Sciweavers

EMNLP
2009

Unsupervised morphological segmentation and clustering with document boundaries

Download www.aclweb.org

Many approaches to unsupervised morphology acquisition incorporate the frequency of character sequences with respect to each other to identify word stems and affixes. This typically involves heuristic search procedures and calibrating multiple arbitrary thresholds. We present a simple approach that uses no thresholds other than those involved in standard application of χ² significance testing. A key part of our approach is using document boundaries to constrain generation of candidate stems and affixes and clustering morphological variants of a given word stem. We evaluate our model on English and the Mayan language Uspanteko; it compares favorably to two benchmark systems which use considerably more complex strategies and rely more on experimentally chosen threshold values.
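
For readers unfamiliar with the chi-square significance testing the abstract mentions, the following is a minimal sketch of a Pearson chi-square test on a 2×2 contingency table. It is not the authors' procedure: the abstract does not say how the tables are built, so the cell meanings, the function names `chi_square_2x2`, `p_value_1dof`, and `is_associated`, and the 0.05 significance level are all assumptions made for illustration.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]].

    Hypothetical reading of the cells (an assumption, not from the paper):
      a = items with both the candidate stem and the candidate affix
      b = items with the stem but not the affix
      c = items with the affix but not the stem
      d = items with neither
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    # An empty row or column makes the statistic undefined; treat as no evidence.
    if 0 in (row1, row2, col1, col2):
        return 0.0
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


def p_value_1dof(statistic):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2.0))


def is_associated(a, b, c, d, alpha=0.05):
    """True if the table departs from independence at level alpha (assumed 0.05)."""
    return p_value_1dof(chi_square_2x2(a, b, c, d)) < alpha


if __name__ == "__main__":
    stat = chi_square_2x2(20, 5, 5, 20)
    print(stat, p_value_1dof(stat), is_associated(20, 5, 5, 20))
```

The only tunable value in such a test is the conventional significance level, which is the sense in which an approach built on it can avoid experimentally chosen thresholds.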

Taesun Moon, Katrin Erk, Jason Baldridge

EMNLP 2009 | Heuristic Search Procedures | Multiple Arbitrary Thresholds | Natural Language Processing | Unsupervised Morphology Acquisition |

Post Info

Added	17 Feb 2011
Updated	17 Feb 2011
Type	Journal
Year	2009
Where	EMNLP
Authors	Taesun Moon, Katrin Erk, Jason Baldridge