The amount of readily available on-line text has reached hundreds of billions of words and continues to grow. Yet for most core natural language tasks, algorithms continue to be o...
This paper reports first results of an empirical study of the precision of classification rules on an independent test set. We generated a large number of rules using a general co...
The lack of discipline and consistency in gene naming poses a formidable challenge to researchers in locating relevant information sources in the genomics literature. The research...
Mehmet Kayaalp, Alan R. Aronson, Susanne M. Humphr...
We present a noun chunker for German which is based on a head-lexicalised probabilistic contextfl'ee grammar. A manually developed grammar was semi-automatically extended wit...
Statistical machine translation systems are usually trained on large amounts of bilingual text (used to learn a translation model), and also large amounts of monolingual text in th...