Semantic event recognition based on visual cues alone has had limited success on unconstrained still pictures. Metadata related to picture taking provides contextual cues independent of the image content and can be used to improve classification performance. For geotagged pictures, we investigate the novel use of satellite images (e.g., those supplied by Google Earth™) corresponding to the GPS (Global Positioning System) coordinates to recognize the picture-taking environment. An initial assessment reveals that, with minimal training, humans can achieve high accuracy in recognizing the terrain environment from views above. We propose to employ both color- and structure-based vocabularies for characterizing satellite images in terms of 14 of the most interesting classes, including residential areas, commercial areas, sports venues, parks, and schools. A multiclass AdaBoost engine is trained to predict the terrain environment given an aerial satellite image of the location. Initial experime...
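As a rough illustration of the classification stage described above, the following sketch trains a multiclass AdaBoost classifier on bag-of-visual-words histograms. All specifics here are assumptions for demonstration: the vocabulary size, the synthetic histogram generator, and the weak-learner settings are not taken from the paper, and scikit-learn's `AdaBoostClassifier` stands in for whatever boosting engine the authors used.

```python
# Hedged sketch: multiclass AdaBoost over visual-word histograms of
# satellite tiles. Vocabulary size and data are illustrative only.
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

N_CLASSES = 14   # e.g., residential, commercial, sports venue, park, school, ...
N_WORDS = 200    # assumed size of the combined color + structure vocabulary

def make_toy_histograms(n_samples=700, seed=0):
    """Synthetic stand-in for visual-word histograms of satellite images."""
    rng = np.random.default_rng(seed)
    y = rng.integers(0, N_CLASSES, size=n_samples)
    X = rng.random((n_samples, N_WORDS))
    # Give each class elevated counts on its own subset of visual words.
    for c in range(N_CLASSES):
        X[y == c, c * 10:(c + 1) * 10] += 2.0
    X /= X.sum(axis=1, keepdims=True)  # L1-normalize each histogram
    return X, y

X, y = make_toy_histograms()
clf = AdaBoostClassifier(n_estimators=100)  # default weak learner: decision stumps
clf.fit(X[:500], y[:500])
acc = clf.score(X[500:], y[500:])           # held-out accuracy on toy data
```

In a real pipeline, the toy histograms would be replaced by quantized color and structure descriptors extracted from the aerial imagery around each photo's GPS coordinates.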