In this paper, we present a knowledge-assisted approach to indexing and retrieving large volumes of medical images. Both images and their associated texts are indexed using medical concepts from the Unified Medical Language System (UMLS) Metathesaurus. We propose a structured learning framework for the modular acquisition of medical semantics from images, with complementary global and local image indexing schemes. Two fusion approaches are also developed to improve text retrieval using the UMLS-based image indexing: a simple post-query fusion, and a visual modality filter that removes visually aberrant images according to the modality concepts of the query. On the ImageCLEFmed 2005 database, our framework outperformed our previous result, which ranked first in the ImageCLEFmed 2005 Medical Image Retrieval task benchmark.
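For illustration only, the two fusion strategies named above can be sketched as follows. The linear weighting, the min-max normalisation, the weight `alpha`, the modality labels, and all function names are assumptions made for this sketch; they are not a description of the system evaluated on ImageCLEFmed 2005.

```python
from typing import Dict, List, Set


def normalise(scores: Dict[str, float]) -> Dict[str, float]:
    """Min-max scale scores to [0, 1]; if all scores are equal, map them to 0.0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    span = hi - lo
    return {k: (v - lo) / span if span else 0.0 for k, v in scores.items()}


def post_query_fusion(
    text_scores: Dict[str, float],
    visual_scores: Dict[str, float],
    alpha: float = 0.7,  # hypothetical weight on the text score
) -> List[str]:
    """Rank images by a weighted sum of normalised text and visual scores.

    The linear combination is an assumed fusion rule, chosen for simplicity.
    An image missing from one score list contributes 0.0 for that list.
    """
    t, v = normalise(text_scores), normalise(visual_scores)
    fused = {
        img: alpha * t.get(img, 0.0) + (1.0 - alpha) * v.get(img, 0.0)
        for img in set(t) | set(v)
    }
    # Sort by descending fused score; break ties by image id for determinism.
    return sorted(fused, key=lambda img: (-fused[img], img))


def modality_filter(
    ranked: List[str],
    image_modalities: Dict[str, Set[str]],
    query_modalities: Set[str],
) -> List[str]:
    """Keep only images whose indexed modality concepts overlap the query's.

    If the query names no modality, the ranking is returned unchanged.
    Images with no indexed modality are dropped (an assumption of this sketch).
    """
    if not query_modalities:
        return list(ranked)
    return [
        img for img in ranked
        if image_modalities.get(img, set()) & query_modalities
    ]


if __name__ == "__main__":
    text = {"a": 3.0, "b": 1.0, "c": 2.0}
    visual = {"a": 0.1, "b": 0.9, "c": 0.5}
    ranking = post_query_fusion(text, visual)
    modalities = {"a": {"x-ray"}, "b": {"mri"}, "c": {"x-ray"}}
    print(modality_filter(ranking, modalities, {"x-ray"}))
```

In this sketch the filter is applied after fusion, so it only removes entries from an existing ranking and never reorders them.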