Abstract--During a music performance, the musician adds expressiveness to the musical message by varying the timing, dynamics, and timbre of the musical events in order to communicate an expressive intention. Traditionally, the analysis of music expression has been based on measuring how the acoustic parameters deviate from the written score. In this paper, we employ machine learning techniques to understand expressive communication and to derive audio features at an intermediate level, between music understood as a structured language and notes understood as sound at a more physical level. We start by extracting audio features from expressive performances recorded by asking musicians to play so as to convey different expressive intentions. We then use a sequential forward selection procedure to rank the features and to select one set for a general description of the expressive intentions and a second set specific to each instrument. We show that higher recognition ratings are ac...
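As a minimal illustration of the sequential forward selection step mentioned above, the sketch below greedily adds, at each iteration, the feature that most improves cross-validated classification accuracy of the expressive intention labels; the order in which features are added provides a ranking. The feature names, the synthetic data, and the k-nearest-neighbour classifier are illustrative assumptions, not the setup used in the paper.

```python
# Hedged sketch: greedy sequential forward selection (SFS) for ranking
# audio features by how well they discriminate expressive intentions.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier


def sequential_forward_selection(X, y, feature_names, n_select):
    """Greedily add the feature that most improves CV accuracy."""
    selected, remaining, ranking = [], list(range(X.shape[1])), []
    for _ in range(n_select):
        best_score, best_j = -np.inf, None
        for j in remaining:
            cols = selected + [j]
            # Score the candidate feature subset with a simple classifier.
            score = cross_val_score(
                KNeighborsClassifier(n_neighbors=3), X[:, cols], y, cv=5
            ).mean()
            if score > best_score:
                best_score, best_j = score, j
        selected.append(best_j)
        remaining.remove(best_j)
        ranking.append((feature_names[best_j], best_score))
    return ranking


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical mid-level features extracted from expressive performances.
    names = ["attack_time", "spectral_centroid", "note_rate", "rms_energy"]
    X = rng.normal(size=(120, len(names)))
    y = rng.integers(0, 4, size=120)  # e.g. four expressive intentions
    for name, acc in sequential_forward_selection(X, y, names, n_select=3):
        print(f"added {name:18s} cumulative CV accuracy = {acc:.2f}")
```

In practice the same loop can be run once over the pooled data to obtain a general feature set and once per instrument to obtain instrument-specific sets, as described in the abstract.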