Multi-document discourse analysis has emerged with the potential of improving various NLP applications. Based on the newly proposed Cross-document Structure Theory (CST), this pap...
The location of video scenes is an important semantic descriptor, especially for broadcast news video. In this paper, we propose a learning-based approach to annotate shot...
This paper describes the Corpus of Japanese classroom Lecture speech Contents (henceforth CJLC), which we are developing. The growing volume of e-Learning content demands a sophisticated intera...
This paper presents the evaluation of the dictionary look-up component of Mayo Clinic's Information Extraction system. The component was tested on a corpus of 160 free-text c...
Karin Schuler, Vinod Kaggal, James J. Masanz, Phil...
Parallel corpora are critical resources for machine translation research and development because they contain translation equivalences of various granularities. Manual a...