Abstract. In this paper, we present initial results towards boosting posterior-based speech recognition systems by estimating more informative posteriors using multiple streams of ...
We introduce a direct model for speech recognition that assumes an unstructured, i.e., flat text output. The flat model allows us to model arbitrary attributes and dependences o...
Georg Heigold, Geoffrey Zweig, Xiao Li, Patrick Ng...
Abstract. This work is dedicated to a statistical trajectory-based approach addressing two issues related to dynamic video content understanding: recognition of events and detection...
Alexandre Hervieu, Patrick Bouthemy, Jean-Pierre L...
Phoneme posterior probabilities estimated using Multi-Layer Perceptrons (MLPs) are extensively used both as acoustic scores and features for speech recognition. In this paper we e...
Samuel Thomas, Patrick Nguyen, Geoffrey Zweig, Hyn...
Semantic event recognition based only on vision cues has had limited success on unconstrained still pictures. Metadata related to picture taking provides contextual cues independe...