This paper addresses a problem of natural language text alignment, from a humanities discipline called textual genetic criticism where different text versions must be compared. The...
This paper investigates a novel approach to unsupervised morphology induction relying on community detection in networks. In a first step, morphological transformation rules are a...
We present new search algorithms to detect the occurrences of any pattern from a given pattern set in a text, allowing in the occurrences a limited number of spurious text charact...
One of the most important steps in text processing and information retrieval is stemming—reducing of words to stems expressing their base meaning, e.g., bake, baked, bakes, bakin...
Alexander F. Gelbukh, Mikhail Alexandrov, Sang-Yon...
There is no blank to mark word boundaries in Chinese text. As a result, identifying words is difficult, because of segmentation ambiguities and occurrences of unknown words. Conve...