Today there are solutions for some specific turn-taking problems, but no general model. We show how turn-taking can be reduced to two more general problems, prediction and selection. We also discuss the value of predicting not only future speech/silence but also prosodic features, thereby handling not only turn-taking but also "turn-shaping". To illustrate how such predictions can be made, we trained a neural network predictor. This was adequate to support some specific turn-taking decisions and was modestly accurate overall.
Nigel G. Ward, Olac Fuentes, Alejandro Vega