Traditionally, listener response prediction models are learned from pre-recorded dyadic interactions. Because of individual differences in behavior, these recordings do not capture the complete ground truth: where the recorded listener did not respond to an opportunity provided by the speaker, another listener might have responded, or vice versa. In this paper, we introduce the concept of parallel listener consensus, in which the listener responses from multiple parallel interactions are combined to better capture differences and similarities between individuals. We show how parallel listener consensus can be used for both learning and evaluating probabilistic prediction models of listener responses. To improve learning performance, the parallel consensus helps identify better negative samples and reduces outliers in the positive samples. We propose a new error measurement called fConsensus, which exploits the parallel consensus to better define the concepts of exactness (mislabels) a...