We consider reinforcement learning in the parameterized setup, where the model is known to belong to a parameterized family of Markov Decision Processes (MDPs). We further impose ...
— We have found a more general formulation of the REINFORCE learning principle which had been proposed by R. J. Williams for the case of artificial neural networks with stochast...