Rich, structured annotations of video recordings enable interesting uses, but existing techniques for manual, and even semi-automated, tagging can be too time-consuming. This paper presents the ContextCam, a prototype consumer video camera that provides point-of-capture annotation of time, location, person presence, and event information associated with recorded video. Both low- and high-level metadata are discovered through a variety of sensing and active tagging techniques, as well as through machine learning techniques that use past annotations to suggest metadata for the current recording. Furthermore, the ContextCam provides users with a minimally intrusive interface for correcting predicted high-level metadata during video recording.
Shwetak N. Patel, Gregory D. Abowd