Acoustic events produced in meeting-room-like environments may carry information useful for perceptually aware interfaces. In this paper, we focus on the problem of combining different information sources at different structural levels for classifying human vocal-tract non-speech sounds. The Fuzzy Integral (FI) approach is used to fuse the outputs of several classification systems, and feature selection and ranking are carried out based on the knowledge extracted from the Fuzzy Measure (FM). In experiments with a limited amount of training data, the FI-based decision-level fusion yielded a classification performance markedly higher than that of the best single classifier, and it can surpass the performance obtained from feature-level integration with Support Vector Machines. Although only the fusion of audio information sources is considered in this work, the conclusions may extend to the multi-modal case.
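To make the decision-level fusion step concrete, the following is a minimal sketch of the discrete Choquet fuzzy integral, one common form of FI-based fusion of classifier confidence scores. The classifier names, score values, and fuzzy-measure table below are illustrative assumptions, not values from the paper; in practice the FM would be learned from training data.

```python
def choquet_integral(scores, fm):
    """Fuse per-classifier confidence scores for one class with the
    discrete Choquet integral.

    scores: dict mapping classifier name -> confidence in [0, 1]
    fm: dict mapping frozenset of classifier names -> fuzzy-measure value,
        monotone in set inclusion, with fm[frozenset()] == 0 and
        fm of the full set == 1.
    """
    # Sort sources by descending confidence: h(x_(1)) >= h(x_(2)) >= ...
    order = sorted(scores, key=scores.get, reverse=True)
    fused, prev_g, subset = 0.0, 0.0, frozenset()
    for src in order:
        subset = subset | {src}              # A_(i): the top-i sources
        g = fm[subset]
        fused += scores[src] * (g - prev_g)  # h(x_(i)) * (g(A_i) - g(A_(i-1)))
        prev_g = g
    return fused

# Hypothetical example: three classifiers scoring one acoustic-event class.
fm = {
    frozenset(): 0.0,
    frozenset({"svm"}): 0.4,
    frozenset({"gmm"}): 0.3,
    frozenset({"knn"}): 0.2,
    frozenset({"svm", "gmm"}): 0.7,
    frozenset({"svm", "knn"}): 0.6,
    frozenset({"gmm", "knn"}): 0.5,
    frozenset({"svm", "gmm", "knn"}): 1.0,
}
scores = {"svm": 0.9, "gmm": 0.6, "knn": 0.3}
print(choquet_integral(scores, fm))  # 0.63
```

Because the FM assigns a worth to every subset of information sources, inspecting the learned measure indicates how much each source (or combination of sources) contributes to the fused decision, which is the basis for the feature selection and ranking mentioned above.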