Reinforcement learning in multi-dimensional state-action space using random rectangular coarse coding and Gibbs sampling