We investigate the impact of time on the accuracy of sentiment classification models created from web logs. We show that sentiment classifiers are time-dependent and, through a series of methodical experiments, quantify the size of this dependence. In particular, we measure the accuracies of 25 different time-specific sentiment classifiers on 24 different testing timeframes. We use the Naive Bayes induction technique with holdout validation on equal-sized but disjoint training and testing data sets. In all, we conduct over 600 experiments and organize our results by the size of the interval (in months) between the training and testing timeframes. Our findings show a significant decrease in accuracy as this interval grows. Using a paired t-test, we show that classifiers trained on future data and tested on past data significantly outperform classifiers trained on past data and tested on future data. These findings are for a topic-specific corpus created from political web logs.
Kathleen T. Durant, Michael D. Smith
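
The experimental protocol summarized in the abstract can be illustrated with a minimal Python sketch: train a Naive Bayes classifier on one monthly timeframe, test it on another, group accuracies by the train/test interval, and run a paired t-test comparing past-to-future against future-to-past training. The toy corpus, the names timeframes and evaluate, and the scikit-learn/scipy choices below are illustrative assumptions, not the authors' actual data or pipeline.

# A minimal sketch under the assumptions above; the corpus and month
# indices are hypothetical stand-ins for the political web log data.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import accuracy_score
from scipy.stats import ttest_rel

# Hypothetical corpus: one (posts, sentiment labels) pair per month.
timeframes = {
    0: (["great speech today", "terrible policy vote"], [1, 0]),
    1: (["strong debate performance", "awful budget plan"], [1, 0]),
    2: (["inspiring campaign rally", "dismal approval numbers"], [1, 0]),
}

def evaluate(train_month, test_month):
    """Train a Naive Bayes classifier on one timeframe, test on another."""
    X_train, y_train = timeframes[train_month]
    X_test, y_test = timeframes[test_month]
    vectorizer = CountVectorizer()
    model = MultinomialNB().fit(vectorizer.fit_transform(X_train), y_train)
    return accuracy_score(y_test, model.predict(vectorizer.transform(X_test)))

# Every pair of distinct months, ordered as (earlier, later).
pairs = [(a, b) for a in timeframes for b in timeframes if a < b]
forward = [evaluate(a, b) for a, b in pairs]   # train on past, test on future
backward = [evaluate(b, a) for a, b in pairs]  # train on future, test on past

# Group forward accuracies by the train/test interval in months,
# mirroring how the paper organizes its results.
by_interval = {}
for (a, b), acc in zip(pairs, forward):
    by_interval.setdefault(b - a, []).append(acc)
for gap, accs in sorted(by_interval.items()):
    print(f"interval {gap} month(s): mean accuracy {sum(accs)/len(accs):.2f}")

# Paired t-test: does future->past training beat past->future training?
print(ttest_rel(backward, forward))

Pairing the two directions on the same (earlier, later) month pair is what makes the t-test paired rather than independent; with the full 25-classifier grid this yields hundreds of matched accuracy pairs.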