This paper applies a reinforcement learning (RL) approach to the near-optimal control of a re-entrant line manufacturing (RLM) model. The RL approach uses an algorithm based on a gradient-descent TD(λ) method to obtain estimates of both the optimal cost function and the control actions. Numerical experiments demonstrate the efficacy of the approach in estimating optimal actions: the resulting performance closely approximates that of the optimal strategy. Generalizations of the RL approach may have the advantage of scaling appropriately to RLM models whose state and action spaces have different dimensions.
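To illustrate the kind of update a gradient-descent TD(λ) method performs, the following is a minimal sketch in Python, not the algorithm used in the paper. It assumes a linear approximation of the cost function, J(s) ≈ θ·φ(s), with accumulating eligibility traces, and it evaluates a fixed policy only (no control actions are selected). The function names, the three-state chain, the unit cost per step, and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np


def td_lambda_update(theta, z, phi_s, phi_next, cost, alpha, gamma, lam):
    """One gradient-descent TD(lambda) step for a linear model J(s) = theta . phi(s)."""
    # For a linear approximator, the gradient of J(s) w.r.t. theta is phi(s),
    # so the accumulating eligibility trace is updated with phi(s).
    z = gamma * lam * z + phi_s
    # Temporal-difference error for the observed one-step cost.
    delta = cost + gamma * (theta @ phi_next) - (theta @ phi_s)
    # Move theta along the trace, scaled by the TD error.
    theta = theta + alpha * delta * z
    return theta, z


def evaluate_chain(n_states=3, episodes=2000, alpha=0.1, gamma=0.9, lam=0.7):
    """Estimate the discounted cost-to-go on a hypothetical deterministic chain.

    States 0 -> 1 -> ... -> n_states-1 -> terminal, with unit cost per step.
    """
    # One-hot features for regular states; the terminal state has zero
    # features, so its estimated cost is always zero.
    phi = np.vstack([np.eye(n_states), np.zeros(n_states)])
    theta = np.zeros(n_states)
    for _ in range(episodes):
        z = np.zeros(n_states)  # reset the trace at the start of each episode
        for s in range(n_states):
            theta, z = td_lambda_update(
                theta, z, phi[s], phi[s + 1], 1.0, alpha, gamma, lam
            )
    return theta


if __name__ == "__main__":
    print(evaluate_chain())
```

On this toy chain the exact discounted costs are 1 + 0.9·1.9 = 2.71, 1 + 0.9·1 = 1.9, and 1 for states 0, 1, and 2, so the learned θ can be checked directly against them. In a control setting such as the RLM problem, the estimated cost function would additionally be used to choose actions, which this sketch leaves out.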
José A. Ramírez-Hernández, Em