Abstract. The ability to lead collaborative discussions and appropriately scaffold learning has been identified as one of the central advantages of human tutorial interaction [6]. To reproduce the effectiveness of human tutors, many developers of tutorial dialogue systems have identified human tutorial tactics and incorporated them into their systems. Understanding how human tutors decide which tactics to use is as important as understanding the tactics themselves. We argue that these decisions are based not only on student actions and the content of student utterances, but also on the meta-communicative information conveyed through spoken utterances (e.g., pauses, disfluencies, intonation). Since this information is less frequent or unavailable in typed input, tutorial dialogue systems with speech interfaces have the potential to be more effective than those without. This paper gives an overview of the Spoken Conversational Tutor (SCoT) t...