We propose a new hierarchical Bayesian n-gram model of natural languages. Our model makes use of a generalization of the commonly used Dirichlet distributions called Pitman-Yor pr...
This paper introduces a novel framework for the accurate retrieval of relational concepts from huge texts. Prior to retrieval, all sentences are annotated with predicate argument ...
This paper investigates a machine learning approach for temporally ordering and anchoring events in natural language texts. To address data sparseness, we used temporal reasoning ...
Inderjeet Mani, Marc Verhagen, Ben Wellner, Chong ...
We consider the task of unsupervised lecture segmentation. We formalize segmentation as a graph-partitioning task that optimizes the normalized cut criterion. Our approach moves b...
Machine Transliteration is to transcribe a word written in a script with approximate phonetic equivalence in another language. It is useful for machine translation, cross-lingual ...
Most of the work on treebank-based statistical parsing exclusively uses the WallStreet-Journal part of the Penn treebank for evaluation purposes. Due to the presence of this quasi...
General information retrieval systems are designed to serve all users without considering individual needs. In this paper, we propose a novel approach to personalized search. It c...
Yuanhua Lv, Le Sun, Junlin Zhang, Jian-Yun Nie, Wa...
This paper presents a new approach based on Equivalent Pseudowords (EPs) to tackle Word Sense Disambiguation (WSD) in Chinese language. EPs are particular artificial ambiguous wor...
An open-domain spoken dialog system has to deal with the challenge of lacking lexical as well as conceptual knowledge. As the real world is constantly changing, it is not possible...