We propose to use wearable computers and sensor systems to generate personal contextual annotations in audio-visual recordings of meetings. In this paper we argue that such annotations are essential for the effective retrieval of relevant information from large audio-visual databases. The paper proposes several useful annotations that can be derived from inexpensive and unobtrusive sensors, describes a hardware platform designed to implement this concept, and presents first experimental results.