A hybrid system is described which combines the strength of manual rulewriting and statistical learning, obtaining results superior to both methods if applied separately. The comb...
Jan Hajic, Pavel Krbec, Pavel Kveton, Karel Oliva,...
Text categorization is a well-known task based essentially on statistical approaches using neural networks, Support Vector Machines and other machine learning algorithms. Texts are...
Of the ten million words of contemporary standard Dutch in the Spoken Dutch Corpus (Corpus Gesproken Nederlands, CGN), a selection of one million words of natural spoken language ...
Heleen Hoekstra, Michael Moortgat, Ineke Schuurman...
: In this paper, we address the problem of Part-Of-Speech(POS) tagging of Arabic texts with vowel marks. After the description of the specificities of Arabic language and the induc...
Chiraz Ben Othmane Zribi, Aroua Torjmen, Mohamed B...
This paper presents a three-step approach for interoperabilising heterogeneous semantic resources. Firstly, we construct homogeneous representations of these resources in a pivot ...