This paper addresses the problem of learning object models from egocentric video of household activities, using extremely weak supervision. For each activity sequence, we know onl...
Speaker diarization is the task of partitioning an input stream into speaker-homogeneous regions or, in other words, determining "who spoke when." While approaches to t...
Recent research has seen the proposal of several new inductive principles designed specifically to avoid the problems associated with maximum likelihood learning in models with in...
Benjamin Marlin, Kevin Swersky, Bo Chen, Nando de ...
We propose a privacy-preserving formulation of a linear program whose constraint matrix is partitioned into groups of columns where each group of columns and its correspo...
A mismatch in speech bandwidth between training and real operation greatly degrades the performance of automatic speech recognition (ASR) systems. The missing feature technique (MFT) is...