In recent years, corpus-based approaches to machine translation have become predominant, with Statistical Machine Translation (SMT) being the most actively progressing area. Succe...
Most machine learning algorithms are lazy: they extract from the training set the minimum information needed to predict its labels. Unfortunately, this often leads to models that ...
Joseph O'Sullivan, John Langford, Rich Caruana, Av...
Biological systems consist of many components and interactions between them. In Systems Biology the principal problem is modeling complex biological systems and reconstructing inte...
Marenglen Biba, Stefano Ferilli, Nicola Di Mauro, ...
Discovering the topic of a large set of text documents using relevant keyphrases is usually regarded as a very tedious task if done by hand. Automatic keyphra...
Khaled M. Hammouda, Diego N. Matute, Mohamed S. Ka...
In this paper, we describe an empirical study of Chinese chunking on a corpus extracted from UPENN Chinese Treebank-4 (CTB4). First, we compare the performance of the st...