Abstract. We report on the design of a system for correcting spelling errors that result in non-existent words. The system aims to improve the editing of medical reports. Unlike traditional systems, it considers both semantic and syntactic context. The system is organized in three steps. The first module relies on a context-independent string-to-string edit distance computation. The second module, based on the morpho-syntactic context, attempts to rank more relevantly the set of candidates provided by the first module. Finally, a third contextual module processes words with the same part of speech by applying contextual word-sense disambiguation. Modules 2 and 3 use both hand-written rules and data-driven Markovian matrices. A final evaluation shows a significant improvement compared to context-free spelling correction.
Patrick Ruch, Robert H. Baud, Antoine Geissbü
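To make the first, context-independent step of the abstract concrete, the following is a minimal sketch of candidate ranking by string-to-string edit distance. It assumes a unit-cost Levenshtein distance, which the abstract does not specify. The function names `edit_distance` and `rank_candidates`, the `max_distance` threshold, and the toy lexicon are hypothetical and serve only as illustration.

```python
from typing import Iterable, List, Tuple


def edit_distance(source: str, target: str) -> int:
    """Unit-cost Levenshtein distance (assumed metric, not confirmed by the paper)."""
    # previous[j] holds the distance between the processed prefix of
    # source and the first j characters of target.
    previous = list(range(len(target) + 1))
    for i, s_char in enumerate(source, start=1):
        current = [i]
        for j, t_char in enumerate(target, start=1):
            cost = 0 if s_char == t_char else 1
            current.append(min(
                previous[j] + 1,         # delete s_char
                current[j - 1] + 1,      # insert t_char
                previous[j - 1] + cost,  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]


def rank_candidates(
    misspelling: str,
    lexicon: Iterable[str],
    max_distance: int = 2,  # hypothetical cut-off
) -> List[Tuple[int, str]]:
    """Return (distance, word) pairs within max_distance, closest first."""
    scored = [(edit_distance(misspelling, word), word) for word in lexicon]
    return sorted(pair for pair in scored if pair[0] <= max_distance)


if __name__ == "__main__":
    # Toy lexicon, for illustration only.
    print(rank_candidates("diabtes", ["diabetes", "diabetic", "dialysis"]))
```

A ranking of this kind ignores context entirely, so candidates at the same distance stay tied; per the abstract, resolving such ties is the role of the morpho-syntactic and word-sense modules.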