Adaptive cognitive orthotics: combining reinforcement learning and constraint-based temporal reasoning