DIAC+: a Professional Diacritics Recovering System


Download www.lrec-conf.org

In languages that use diacritical characters, stripping these signs from a word may yield a string that does not exist in the language, in which case its normative form is generally easy to recover. This is not always the case, however: when a word exists both with and without a diacritical sign on a base letter, the presence or absence of that sign may change its grammatical properties or even its meaning. Recovering the missing diacritics then becomes a difficult task, not only for a program but sometimes even for a human reader. We describe and evaluate an accurate knowledge-based system for automatically recovering the missing diacritics in MS Office documents written in Romanian. In the rare cases when the system cannot reliably make a decision, it either provides the user with a list of words and their recovery suggestions, or probabilistically chooses one of the possible changes but leaves a trace (a highlighted comment) on each word the modification ...
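The distinction the abstract draws between easy and hard cases can be illustrated with a minimal sketch. This is not the DIAC+ method, which the abstract describes only as knowledge-based; it is a plain lexicon lookup under stated assumptions. The word list `LEXICON` and the helpers `strip_diacritics`, `build_index`, and `candidates` are hypothetical names introduced here for illustration. A stripped form with exactly one candidate is the easy case; several candidates mark the ambiguous case that needs context or a user decision.

```python
import unicodedata
from collections import defaultdict

# Hypothetical toy lexicon (an assumption for illustration only);
# a real system would rely on a large Romanian lexicon.
LEXICON = ["și", "țară", "fata", "fată", "față", "peste", "pește"]


def strip_diacritics(word):
    """Remove combining marks after canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def build_index(lexicon):
    """Map each diacritics-free form to the set of lexicon words it may stand for."""
    index = defaultdict(set)
    for word in lexicon:
        index[strip_diacritics(word)].add(word)
    return index


def candidates(index, stripped):
    """Return the sorted recovery candidates for a diacritics-free word."""
    return sorted(index.get(stripped, set()))


if __name__ == "__main__":
    index = build_index(LEXICON)
    for token in ["si", "tara", "fata", "peste"]:
        options = candidates(index, token)
        status = "unambiguous" if len(options) == 1 else "ambiguous"
        print(token, status, options)
```

In this toy lexicon, "si" and "tara" each map to a single word, while "fata" and "peste" map to several, which is the situation where a lookup alone cannot decide and further knowledge is required.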

Dan Tufis, Alexandru Ceausu


Diacritical Characters | Diacritical Sign | Education | LREC 2008 | Normative Form |


Post Info

Added	29 Oct 2010
Updated	29 Oct 2010
Type	Conference
Year	2008
Where	LREC
Authors	Dan Tufis, Alexandru Ceausu
