The goal of this research is to infer traits of groups of people from their turn-taking behavior in natural conversation. These traits are latent attributes in a social network, whose relative frequencies we estimate from content-derived metadata. Our approach is to train statistical models of turn-taking behavior using automatic labels of speech activity, and to measure the association of these models with socially correlated traits. We evaluate these ideas experimentally on the Switchboard-1 speech corpus, which provides speech content and metadata associated with each speaker, such as gender, age, and education, as well as inferred social correlates such as willingness to participate and initiate. We show that population proportions of these socially correlated externals can be predicted with a root-mean-squared error of approximately 0.1 across all mixture proportions.
John Grothendieck, Allen L. Gorin, Nash M. Borges