highly abstracted. The Chinese writing system uses logographs--conventional representations of words or morphemes. Characters of the most common kind have two parts, one suggesting...
Shannon's Noisy-Channel model, which describes how a corrupted message might be reconstructed, has been the cornerstone for much work in statistical language and speech proc...
Recent work on the transfer of semantic information across languages has been applied to the development of resources annotated with Frame information for different non-En...
Roberto Basili, Diego De Cao, Danilo Croce, Bonave...
The wide availability of large-scale databases requires more efficient and scalable tools for data understanding and knowledge discovery. In this paper, we present a method to ...
Duy-Dinh Le, Shin'ichi Satoh, Michael E. Houle, Da...
This paper presents an unsupervised learning approach to building a non-English (Arabic) stemmer. The stemming model is based on statistical machine translation and it uses an Eng...