highly abstracted. The Chinese writing system uses logographs--conventional representations of words or morphemes. Characters of the most common kind have two parts, one suggesting...
In this paper we introduce a statistical Named Entity recognizer (NER) system for the Hungarian language. We examined three methods for identifying and disambiguating proper nouns...
Structured models often achieve excellent performance but can be slow at test time. We investigate structure compilation, where we replace structure with features, which are often...
This paper1 presents an empirical approach to mining parallel corpora. Conventional approaches use a readily available collection of comparable, nonparallel corpora to extract par...
Syntactic machine translation systems currently use word alignments to infer syntactic correspondences between the source and target languages. Instead, we propose an unsupervised...