In this paper, we present methods for analyzing dialog coherence that automatically distinguish coherent from incoherent conversations. We build a machine learning classifier using local transition patterns that span adjacent dialog turns and encode both lexical and semantic information. We evaluate our algorithm on the Switchboard dialog corpus, treating the original Switchboard dialogs as coherent (positive) examples and creating incoherent (negative) examples by randomly shuffling turns from these dialogs. The results are promising: accuracy is 89% (against a 50% baseline) when incoherent dialogs have both random order and random content (topics), and 68% when incoherent dialogs are randomly ordered but on-topic. We also present experiments on a newspaper text corpus and compare our findings on the two datasets.
Amruta Purandare, Diane J. Litman
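
The construction of on-topic negative examples described in the abstract (original dialogs as positives, turn-shuffled copies as negatives) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `make_incoherent` and `build_dataset`, the representation of a dialog as a list of turn strings, and the choice of one shuffled copy per dialog are all assumptions made for illustration. Pairing each dialog with exactly one shuffled copy gives balanced classes, which is consistent with the 50% baseline reported above.

```python
import random


def make_incoherent(turns, rng):
    """Return a shuffled copy of `turns` whose order differs from the original.

    Hypothetical helper: keeps the same turns (on-topic) but randomizes order.
    """
    original = list(turns)
    if len(set(original)) < 2:
        # A shuffle cannot change the order of fewer than two distinct turns.
        raise ValueError("need at least two distinct turns to shuffle")
    shuffled = list(original)
    # Reshuffle until the order actually changes, so the negative is not
    # accidentally identical to the positive example.
    while shuffled == original:
        rng.shuffle(shuffled)
    return shuffled


def build_dataset(dialogs, seed=0):
    """Pair each dialog (label 1) with one shuffled copy (label 0)."""
    rng = random.Random(seed)  # seeded for reproducibility (assumption)
    examples = []
    for turns in dialogs:
        examples.append((list(turns), 1))                  # coherent
        examples.append((make_incoherent(turns, rng), 0))  # incoherent
    return examples
```

The sketch covers only the randomly ordered, on-topic condition. The condition with random content as well would presumably draw turns from more than one dialog, but the abstract does not specify how, so it is not sketched here.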