Large corpora are essential to modern methods of computational linguistics and natural language processing. In this paper, we describe an ongoing project whose aim is to build a l...
Abstract. We present a loosely-supervised method for context-free identification of transliterated foreign names and borrowed words in Hebrew text. The method is purely statistical...
Background: The proliferate nature of DNA microarray results have made it necessary to implement a uniform and quick quality control of experimental results to ensure the consiste...
Andreas Petri, Jan Fleckner, Mads Wichmann Matthie...
Previous work on Natural Language Processing for Information Retrieval has shown the inadequateness of semantic and syntactic structures for both document retrieval and categoriza...
Statistical machine translation systems are usually trained on large amounts of bilingual text (used to learn a translation model), and also large amounts of monolingual text in th...