In many Multi-Agent Systems (MAS), agents (even self-interested ones) need to cooperate in order to maximize their own utilities. Most multi-agent learning algorithms focus on one-shot games, whose rational optimal solution is a Nash Equilibrium. These solutions are often no longer optimal in repeated interactions, where, in the long run, more profitable (Pareto Efficient) equilibrium points emerge (e.g., in the iterated Prisoner's Dilemma). The goal of this work is to improve existing rational Reinforcement Learning (RL) algorithms, which typically learn the one-shot Nash Equilibrium solution, using design principles that foster convergence to the Pareto Efficient equilibrium. In this paper we propose two principles (Change or Learn Fast and Change and Keep) aimed at improving cooperation among Q-learning (a popular RL algorithm) agents in self-play. Using MASD (Multi-Agent Social Dilemma), an n-player, m-action version of the iterated Prisoner's Dilemma,...
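To make the baseline behavior concrete, the following is a minimal sketch (not the paper's MASD implementation) of two plain Q-learning agents in self-play on the classic Prisoner's Dilemma, the n=2, m=2 special case of MASD; the payoff values and hyperparameters are illustrative assumptions:

```python
import random

ACTIONS = ["C", "D"]  # cooperate / defect
# PAYOFFS[(my_action, opponent_action)] -> my reward (standard PD values; assumed)
PAYOFFS = {
    ("C", "C"): 3, ("C", "D"): 0,
    ("D", "C"): 5, ("D", "D"): 1,
}

class QLearner:
    """Stateless Q-learner: one Q-value per action (single-state repeated game)."""

    def __init__(self, alpha=0.1, epsilon=0.1):
        self.alpha = alpha      # learning rate (illustrative)
        self.epsilon = epsilon  # exploration rate (illustrative)
        self.q = {a: 0.0 for a in ACTIONS}

    def act(self):
        # epsilon-greedy action selection
        if random.random() < self.epsilon:
            return random.choice(ACTIONS)
        return max(self.q, key=self.q.get)

    def update(self, action, reward):
        # standard Q-learning update, simplified to the stateless case
        self.q[action] += self.alpha * (reward - self.q[action])

agents = [QLearner(), QLearner()]
for _ in range(10_000):
    a0, a1 = agents[0].act(), agents[1].act()
    agents[0].update(a0, PAYOFFS[(a0, a1)])
    agents[1].update(a1, PAYOFFS[(a1, a0)])

# Plain Q-learners in self-play typically converge to mutual defection
# (the one-shot Nash Equilibrium) rather than Pareto-efficient mutual
# cooperation; the principles proposed in the paper target this gap.
print({a: round(q, 2) for a, q in agents[0].q.items()})
```

Running the loop usually leaves the "D" Q-value dominant for both agents, illustrating the gap between the learned Nash Equilibrium and the Pareto Efficient outcome that the proposed principles aim to close.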