This paper introduces the Multimodal Multi-view Integrated Database (MMID), which stores human activities in presentation situations. MMID contains audio, video, human body motions, and transcripts, which are related to each other by their occurrence time. MMID accepts basic queries for the stored data. By referring to the retrieved data, we can examine how the different modalities are used cooperatively and complementarily in real situations. This examination across different situations is essential for understanding human behaviors, since they depend heavily on context and personal characteristics. In this sense, MMID can serve as a basis for systematic or statistical analysis of those modalities, and it can be a useful tool for designing an intelligent user interface system or a multimedia content handling system. In this paper, we present the database design and its possible applications.
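As a minimal sketch of the idea of relating modalities by occurrence time and retrieving them with basic queries, the following Python fragment models time-stamped records for each modality and an overlap query over a time interval. The record fields, the `ModalityRecord` type, and the `query_interval` function are hypothetical illustrations, not the actual MMID schema or API.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical record type: each modality entry carries the time span
# (in seconds from the start of the presentation) at which it occurred.
@dataclass
class ModalityRecord:
    modality: str      # e.g. "audio", "video", "motion", "transcript"
    start: float       # occurrence start time [s]
    end: float         # occurrence end time [s]
    payload: str       # reference to the stored data (file path, caption, ...)

def query_interval(records: List[ModalityRecord],
                   t_start: float, t_end: float,
                   modality: Optional[str] = None) -> List[ModalityRecord]:
    """Return all records whose occurrence time overlaps [t_start, t_end],
    optionally restricted to a single modality."""
    return [r for r in records
            if r.start < t_end and r.end > t_start
            and (modality is None or r.modality == modality)]

# Example: find which body motions co-occur with a transcript segment.
records = [
    ModalityRecord("transcript", 12.0, 15.5, '"This graph shows ..."'),
    ModalityRecord("motion", 12.8, 14.0, "pointing gesture, right hand"),
    ModalityRecord("video", 0.0, 600.0, "camera-2 overview shot"),
]
hits = query_interval(records, 12.0, 15.5, modality="motion")
print([r.payload for r in hits])   # -> ['pointing gesture, right hand']
```

Such a time-overlap query is one way the stored modalities could be examined together; the actual database may use a different storage and query mechanism.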