In this paper, we explore the use of machine learning and data mining to improve the prediction of travel times in an automobile. We consider two formulations of this problem, one that involves predicting speeds at different stages along the route and another that relies on direct prediction of transit time. We focus on the second formulation, which we apply to data collected from the San Diego freeway system. We report experiments on these data with k-nearest neighbour combined with a wrapper to select useful features and normalization parameters. The results suggest that 3-nearest neighbour, when using information from freeway sensors, substantially outperforms predictions available from existing digital maps. Analyses also reveal some surprises about the usefulness of other features like the time and day of the trip.
Simon Handley, Pat Langley, Folke A. Rauscher
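
As an illustration of the approach summarized above, the following is a minimal sketch of k-nearest neighbour prediction of trip duration combined with a wrapper for feature selection. It is not the authors' implementation: the greedy forward-selection strategy, the leave-one-out error measure, the function names, and the synthetic data generator are all assumptions made for this example, and the selection of normalization parameters is omitted.

```python
import math
import random


def knn_predict(train, query, features, k=3):
    """Average the targets of the k training rows nearest to `query`,
    using Euclidean distance over the selected feature indices only."""
    def dist(row):
        return math.sqrt(sum((row[0][f] - query[f]) ** 2 for f in features))
    nearest = sorted(train, key=dist)[:k]
    return sum(y for _, y in nearest) / len(nearest)


def loo_error(data, features, k=3):
    """Leave-one-out mean absolute error for one feature subset."""
    total = 0.0
    for i, (x, y) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        total += abs(knn_predict(rest, x, features, k) - y)
    return total / len(data)


def forward_select(data, n_features, k=3):
    """Greedy forward wrapper (an assumed strategy): repeatedly add the
    feature that most reduces leave-one-out error; stop when none helps."""
    selected, best = [], float("inf")
    while True:
        candidates = [f for f in range(n_features) if f not in selected]
        scored = [(loo_error(data, selected + [f], k), f) for f in candidates]
        if not scored:
            break
        err, f = min(scored)
        if err >= best:
            break
        selected.append(f)
        best = err
    return selected, best


def make_synthetic_trips(n=60, seed=0):
    """Hypothetical data, not the San Diego measurements.
    Feature 0: sensor speed in mph; feature 1: an irrelevant value.
    Target: minutes to cover 10 miles at that speed, plus small noise."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        speed = rng.uniform(20.0, 70.0)
        irrelevant = rng.uniform(20.0, 70.0)
        minutes = 600.0 / speed + rng.gauss(0.0, 0.2)
        data.append(((speed, irrelevant), minutes))
    return data
```

Calling `forward_select(make_synthetic_trips(), n_features=2)` returns the chosen feature indices and their leave-one-out error. On this synthetic data the sensor-speed feature is picked first, because the irrelevant feature alone predicts duration far worse.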