Sciweavers

467 search results - page 92 / 94
» Phoneme segmentation of speech
CVPR
2012
IEEE
11 years 10 months ago
Example-based cross-modal denoising
Widespread current cameras are part of multisensory systems with an integrated computer (smartphones). Computer vision thus starts evolving to cross-modal sensing, where vision an...
Dana Segev, Yoav Y. Schechner, Michael Elad
MM
2006
ACM
14 years 1 months ago
Towards content-based relevance ranking for video search
Most existing web video search engines index videos by file names, URLs, and surrounding texts. These types of video roughly describe the whole video at an abstract level without ...
Wei Lai, Xian-Sheng Hua, Wei-Ying Ma
MM
2005
ACM
14 years 1 months ago
Unsupervised content discovery in composite audio
Automatically extracting semantic content from audio streams can be helpful in many multimedia applications. Motivated by the known limitations of traditional supervised approache...
Rui Cai, Lie Lu, Alan Hanjalic
CIKM
2011
Springer
12 years 7 months ago
Focusing on novelty: a crawling strategy to build diverse language models
Word prediction performed by language models plays an important role in many tasks, e.g. word sense disambiguation, speech recognition, handwriting recognition, query spelling an...
Luciano Barbosa, Srinivas Bangalore
CLEAR
2007
Springer
14 years 1 months ago
The IBM Rich Transcription 2007 Speech-to-Text Systems for Lecture Meetings
The paper describes the IBM systems submitted to the NIST Rich Transcription 2007 (RT07) evaluation campaign for the speech-to-text (STT) and speaker-attributed speech-to-text (SAST...
Jing Huang, Etienne Marcheret, Karthik Visweswaria...