Partially observable Markov decision processes (POMDPs) provide a natural framework for designing applications that continuously make decisions based on noisy sensor measurements. The recent proliferation of smartphones and wearable devices has led to new applications in which energy efficiency becomes an issue. To reduce energy requirements, finite-state controllers can be used because they are computationally inexpensive to execute. Moreover, in multi-agent POMDPs (e.g., Dec-POMDPs or I-POMDPs), finite-state controllers are one of the most important policy representations. Online methods scale best; however, they are energy demanding. Methods for optimizing finite-state controllers are therefore necessary. In this paper, we present a new, efficient approach to bounded policy iteration (BPI). BPI keeps the controller small, which is a desirable property for applications, especially on small devices. However, finding a...