This paper describes a document image analysis system in which multiple agents operate on a pyramid structure to separate text from graphics in the image. Text strings appear as different groupings of connected components at different image resolutions. The pyramid structure, a multi-resolution image representation, therefore provides a natural means of identifying and grouping character strings in the document at different levels of resolution. The pyramid structure is also amenable to parallel processing: multiple agents in the system can individually and concurrently search for groups of connected components at the appropriate levels. Unlike existing approaches, the agent-based pyramid operations do not require expensive feature analysis among different connected components to detect text strings.
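
To illustrate how connected components regroup across resolutions, the following minimal sketch builds a binary image pyramid and counts components at each level. It is not the system described in this paper: the 2x2 OR-reduction rule, the 8-connectivity, the toy image, and all function names (`reduce_level`, `count_components`, `build_pyramid`) are assumptions chosen for illustration, and a thread pool stands in loosely for agents that examine the levels concurrently.

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor


def reduce_level(img):
    """Halve the resolution (assumed rule): a parent pixel is 1 if any
    pixel in its 2x2 block of children is 1."""
    h, w = len(img), len(img[0])
    return [
        [
            int(any(
                img[y][x]
                for y in range(2 * r, min(2 * r + 2, h))
                for x in range(2 * c, min(2 * c + 2, w))
            ))
            for c in range((w + 1) // 2)
        ]
        for r in range((h + 1) // 2)
    ]


def build_pyramid(img, levels):
    """Return [level 0 (full resolution), level 1, ...]."""
    pyramid = [img]
    for _ in range(levels - 1):
        pyramid.append(reduce_level(pyramid[-1]))
    return pyramid


def count_components(img):
    """Count 8-connected foreground components using breadth-first search."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    count = 0
    for y in range(h):
        for x in range(w):
            if not img[y][x] or seen[y][x]:
                continue
            count += 1
            seen[y][x] = True
            queue = deque([(y, x)])
            while queue:
                cy, cx = queue.popleft()
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and img[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
    return count


# Toy image: three one-pixel-wide "characters" separated by one-pixel gaps.
image = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [1, 0, 1, 0, 1, 0, 0, 0],
    [1, 0, 1, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]

pyramid = build_pyramid(image, levels=3)

# One worker per level, as a stand-in for concurrently operating agents.
with ThreadPoolExecutor(max_workers=len(pyramid)) as pool:
    counts = list(pool.map(count_components, pyramid))

print(counts)
```

In this toy case the three separate components at full resolution merge into a single component one level up, which is the kind of grouping a string-detecting agent could look for at that level without comparing features between individual components.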