Abstract. Emerging electronic text formats include hierarchical structure and visualization-related information that current Text-to-Speech (TtS) systems ignore. In this paper we present a novel approach for composing a detailed auditory representation of e-texts using speech and audio. Furthermore, we provide a scripting language (CAD scripts) for defining specific customizations of the operation of a TtS system. CAD scripts can also be assigned to specific text meta-data to enable their discrete auditory representation. This approach can serve as a means for a detailed exchange of functionality across different TtS implementations. Moreover, it can be hosted in current TtS systems with minor (or major) modifications. Finally, we briefly present the implementation of the DEMOSTHeNES Composer for augmented auditory generation of meta-text using the above methodology.