Abstract. Spoken audio is an important source of information for knowledge extraction and management systems. Organizing spoken messages by priority and content can facilitate knowledge capture and decision making based on recipient profiles, which can be determined by physical and social conditions. This paper revisits that task and addresses a related data sparseness problem. We propose a methodology in which the coverage of the language models used to categorize message types is augmented with previously unobserved lexical information derived from other corpora. This lexical information is obtained by combining word classes constructed by an agglomerative clustering algorithm that follows a criterion of minimum loss in average mutual information. We then generate more robust category estimators by interpolating class-based and voicemail word-based models.
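The interpolation step summarized above can be illustrated with a minimal sketch. It assumes unigram models and a simple linear mixture with a fixed weight; the function name `interpolate`, the weight `lam`, and the toy probabilities are hypothetical and are not taken from the paper, whose actual model order and weighting scheme may differ.

```python
def interpolate(word, word_model, class_model, membership, lam):
    """Linearly mix a word-based and a class-based unigram estimate.

    word_model:  P_word(w | category), estimated from voicemail data (assumed).
    class_model: P(class | category) for each word class (assumed).
    membership:  w -> (class, P(w | class)), derived from an external corpus (assumed).
    lam:         interpolation weight on the word-based model (hypothetical value).
    """
    p_word = word_model.get(word, 0.0)
    cls, p_in_class = membership.get(word, (None, 0.0))
    p_class_based = class_model.get(cls, 0.0) * p_in_class
    return lam * p_word + (1.0 - lam) * p_class_based


# Toy data: "conference" never occurs in the voicemail data for this category,
# but it shares a word class with "meeting" and "call" in the external corpus.
word_model = {"meeting": 0.5, "call": 0.5}
class_model = {"C1": 1.0}
membership = {
    "meeting": ("C1", 0.25),
    "call": ("C1", 0.25),
    "conference": ("C1", 0.5),
}

p_unseen = interpolate("conference", word_model, class_model, membership, lam=0.5)
p_total = sum(
    interpolate(w, word_model, class_model, membership, lam=0.5)
    for w in membership
)
```

In this toy setting the word unseen in the voicemail data receives non-zero probability through its class, while the mixture still sums to one over the vocabulary. This is the sense in which the class-based component extends the coverage of the word-based model.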