Abstract--Unlimited vocabulary annotation of multimedia documents remains elusive despite progress solving the problem in the case of a small, fixed lexicon. Taking advantage of th...
Supervised learning from multiple labeling sources is an increasingly important problem in machine learning and data mining. This paper develops a probabilistic approach to this p...
Parallel corpora are critical resources for machine translation research and development since parallel corpora contain translation equivalences of various granularities. Manual a...
Biological research is becoming increasingly complex and data-rich, with multiple public databases providing a variety of resources: hundreds of thousands of substances and interac...
Michael L. Blinov, Oliver Ruebenacker, James C. Sc...
The paper presents a tool assisting manual annotation of linguistic data developed at the Department of Computational linguistics, IBL-BAS. Chooser is a general-purpose modular ap...