This paper presents region moments, a class of appearance descriptors based on image moments applied to a pool of image features. A careful design of the moments and the image features makes the descriptors scale and rotation invariant, and therefore suitable for vehicle detection from aerial video, where targets appear at different scales and orientations. Region moments are linearly related to the image features. Consequently, costly geodesic distances and non-linear classifiers can be avoided when comparing descriptors, because Euclidean geometry and linear classifiers remain effective. The descriptors are computed efficiently by a fast procedure based on the integral representation. An extensive comparison between region moments and region covariance descriptors reports theoretical, qualitative, and quantitative differences between them, with a clear advantage for region moments when used for detecting small image structures, such as vehicles in aerial ...