Content-based search in audio-visual collections requires media-specific analysis to extract low-level features that can be efficiently indexed and searched. We present the SAPIR media framework, which analyzes digital content and represents the extracted features in a common schema used to index and search content in a P2P network. To handle complex media such as video, the framework includes splitters that decompose compound objects into simple objects, which are then processed by image and speech analyzers. We report on the use of this framework in the SAPIR demo.
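
As an illustration of the split-then-analyze flow described above, the following is a minimal sketch and not the SAPIR implementation. All names (`MediaObject`, `split_video`, `analyze_image`, `analyze_speech`, `process`) and the record layout are hypothetical, and the "features" are placeholders standing in for real low-level descriptors.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class MediaObject:
    """A piece of content; `kind` is e.g. "video", "image" or "speech" (assumed labels)."""
    kind: str
    payload: str
    parts: List["MediaObject"] = field(default_factory=list)


def split_video(video: MediaObject) -> List[MediaObject]:
    """Hypothetical splitter: a compound video yields its simple image and speech parts."""
    return list(video.parts)


def analyze_image(obj: MediaObject) -> Dict[str, object]:
    # Placeholder feature; a real analyzer would compute visual descriptors.
    return {"media": "image", "source": obj.payload,
            "features": {"length": len(obj.payload)}}


def analyze_speech(obj: MediaObject) -> Dict[str, object]:
    # Placeholder feature; a real analyzer would produce a speech transcript.
    return {"media": "speech", "source": obj.payload,
            "features": {"words": obj.payload.split()}}


# Registries mapping a media kind to its splitter or analyzer (assumed design).
SPLITTERS: Dict[str, Callable[[MediaObject], List[MediaObject]]] = {
    "video": split_video,
}
ANALYZERS: Dict[str, Callable[[MediaObject], Dict[str, object]]] = {
    "image": analyze_image,
    "speech": analyze_speech,
}


def process(obj: MediaObject) -> List[Dict[str, object]]:
    """Split compound objects recursively, then analyze each simple object.

    Every returned record shares the same keys, standing in for a common schema.
    """
    if obj.kind in SPLITTERS:
        records: List[Dict[str, object]] = []
        for part in SPLITTERS[obj.kind](obj):
            records.extend(process(part))
        return records
    return [ANALYZERS[obj.kind](obj)]
```

Calling `process` on a video object whose `parts` hold one image and one speech object would return two records with identical keys, which an indexing layer could then consume without knowing the original media type.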