This paper proposes a system for automatic detection of indoor scene events, combined with interactive inquiry based on speech dialog and gesture recognition. The system uses image recognition to detect events in which various objects are brought into or taken out of the scene. The user queries the stored past events by pointing at objects or regions of space and by speech dialog. Because automatic event detection may fail in complicated indoor scenes, the system can use the interactive inquiry to correct such failures.