Open source intelligence analysts routinely use the web as a source of information related to their specific taskings. Effective information gathering on the web, despite the prog...
Recent text and speech processing applications such as speech mining raise new and more general problems related to the construction of language models. We present and describe in...
We demonstrate that an unlexicalized PCFG can parse much more accurately than previously shown, by making use of simple, linguistically motivated state splits, which break down fa...
A hybrid system is described which combines the strengths of manual rule writing and statistical learning, obtaining results superior to those of either method applied separately. The comb...
Jan Hajic, Pavel Krbec, Pavel Kveton, Karel Oliva,...
This paper describes a classifier that assigns semantic thesaurus categories to unknown Chinese words (words not already in the CiLin thesaurus and the Chinese Electronic Dictiona...