Aerial video provides strong cues for automatic road extraction that are not available in static aerial images. In stabilized (or geo-referenced) video, the distribution of spatio-temporal image derivatives at each pixel gives a powerful, local representation of the scene variation and motion typical at that pixel. This allows a functional attribution of the scene: a "road" is defined as a path of consistent motion, a definition that is valid in a large and diverse set of environments. Using a classical relationship between image motion and spatio-temporal image derivatives, road features can be extracted as image regions that have significant image variation and motion consistent with that of their neighbors. The video pre-processing that generates image derivative distributions over arbitrarily long sequences is implemented in real time on standard laptops, and the flow field computation and interpretation involve a small number of 3 by 3 matrix operations at each pixel locati...
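As an illustration of the kind of per-pixel computation described above, the following minimal sketch assumes the "classical relationship" is the brightness-constancy constraint Ix*u + Iy*v + It = 0, solved in the least-squares sense from a running 3 by 3 matrix of derivative second moments. The class name DerivativeMoments, its methods, and the min_det threshold are hypothetical names introduced here for illustration; this is not claimed to be the authors' implementation.

```python
from dataclasses import dataclass, field


@dataclass
class DerivativeMoments:
    """Running second moments of (Ix, Iy, It) at one pixel: a symmetric 3x3 matrix.

    Hypothetical helper for illustration only.
    """
    n: int = 0
    m: list = field(default_factory=lambda: [[0.0] * 3 for _ in range(3)])

    def update(self, ix, iy, it):
        """Add one frame's derivative sample to the accumulated outer products."""
        g = (ix, iy, it)
        self.n += 1
        for r in range(3):
            for c in range(3):
                self.m[r][c] += g[r] * g[c]

    def variation(self):
        """Mean squared spatial gradient magnitude (0.0 if no samples yet)."""
        if self.n == 0:
            return 0.0
        return (self.m[0][0] + self.m[1][1]) / self.n

    def flow(self, min_det=1e-9):
        """Least-squares (u, v) for Ix*u + Iy*v + It = 0.

        Returns None when the 2x2 spatial block is near-singular, i.e. the
        gradients seen so far do not constrain both flow components.
        """
        (a, b, p), (_, d, q), _ = self.m
        det = a * d - b * b
        if det <= min_det:
            return None
        # Solve [[a, b], [b, d]] @ (u, v) = -(p, q) with the 2x2 inverse.
        u = (-p * d + b * q) / det
        v = (b * p - a * q) / det
        return u, v


# Usage: derivative samples generated by a constant motion of (1.5, -0.5).
acc = DerivativeMoments()
for ix, iy in [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, -1.0)]:
    acc.update(ix, iy, -(1.5 * ix - 0.5 * iy))
print(acc.flow(), acc.variation())
```

Because only the accumulated matrix is stored, the memory per pixel stays fixed regardless of sequence length. A pixel could then be treated as a road candidate when variation() is large and its flow() agrees with that of neighboring pixels; how those thresholds are chosen is left open in this sketch.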