One proposed approach to managing a large complex Smart Grid is through Broker Agents who buy electrical power from distributed producers, and also sell power to consumers, via a Tariff Market—a new market mechanism where Broker Agents publish concurrent bid and ask prices. A key challenge is the specification of the market strategy that the Broker Agents should use in order to earn profits while maintaining the market’s balance of supply and demand. Interestingly, previous work has shown that a Broker Agent can learn its strategy, using Markov Decision Processes (MDPs) and Qlearning, and outperform other Broker Agents that use predetermined or randomized strategies. In this work, we investigate the more representative scenario in which multiple Broker Agents, instead of a single one, are independently learning their strategies. Using a simulation environment based on real data, we find that Broker Agents who employ periodic increases in exploration achieve higher rewards. We a...
Prashant P. Reddy, Manuela M. Veloso