In this paper, we propose a PLSA-based language model for sports live speech. This model is implemented in unigram rescaling technique that combines a topic model and an n-gram. I...
Spoken Document Retrieval (SDR) is a promising technology for enhancing the utility of spoken materials. After the spoken documents have been transcribed by using a Large Vocabula...
Language modeling is to associate a sequence of words with a priori probability, which is a key part of many natural language applications such as speech recognition and statistic...
Abstract. In a previous work, a new probabilistic context-free grammar (PCFG) model for natural language parsing derived from a tree bank corpus has been introduced. The model esti...
Recently, a novel and structural representation of speech was proposed [1, 2], where the inevitable acoustic variations caused by nonlinguistic factors are effectively removed fro...