This paper proposes a novel method for characterizing the performance of autonomous agents in the Trading Agent Competition for Supply Chain Management (TAC-SCM). We create benchmarking tools that manipulate market environments to control test conditions, and we provide guidelines for testing trading agents. Using these tools, we show how developers can inspect their agents and uncover behaviors that might otherwise go undiscovered.