When building rule-based machine translation systems, considerable human effort is needed to code the transfer rules that are able to translate source-language sentences into gra...
Question Answering (QA) aims at providing users with short text units that answer specific, well-formed natural language questions. A two stage architecture is widely adopted for t...
Spoken language identification consists in recognizing a language based on a sample of speech from an unknown speaker. The traditional approach for this task mainly considers the p...
Abstract. German compound words pose special problems to statistical machine translation systems: the occurrence of each of the components in the training data is not sufficient for...
The naive Bayes classifier is a frequently used method in various natural language processing tasks. Inspired by a modified version of the method called the flexible Bayes classifier, ...
Tapio Pahikkala, Jorma Boberg, Aleksandr Myllä...
Abstract. We introduce a method for content-based advertisement selection for personal blog pages, based on combining multiple representations of the blog. The core idea behind the...
Abstract. Discourse segmentation is the division of a text into minimal discourse segments, which form the leaves in the trees that are used to represent discourse structures. A de...
Abstract. This paper presents a machine learning approach for paraphrase identification which uses lexical and semantic similarity information. In the experimental studies, we exam...
Abstract. Every language employs its own coordination strategies, according to the type of coordinating marking, the pattern of marking, the position of the marker, and the phrase ...