We introduce the corpus of United States Congressional bills from 1947 to 1998 for use by language research communities. The U.S. Policy Agenda Legislation Corpus Volume 1 (USPALC...
tion Abstract ChengXiang Zhai (Advisor: John Lafferty) Language Technologies Institute School of Computer Science Carnegie Mellon University With the dramatic increase in online in...
This paper presents the Topic-Aspect Model (TAM), a Bayesian mixture model which jointly discovers topics and aspects. We broadly define an aspect of a document as a characteristi...
—In this paper, we propose a novel method for extracting handwritten characters from multi-language document images, which may contain various types of characters, e.g. Chinese, ...
Yonghong Song, Guilin Xiao, Yuanlin Zhang, Lei Yan...
Abstract--This paper is concerned with the automatic recognition of dialogue acts (DAs) in multiparty conversational speech. We present a joint generative model for DA recognition ...