Multiarmed bandit problem is a typical example of a dilemma between exploration and exploitation in reinforcement learning. This problem is expressed as a model of a gambler playi...
Estimation of Distribution Algorithms (EDAs) are a class of evolutionary algorithms that use machine learning techniques to solve optimization problems. Machine learning is used t...
The decomposition method is currently one of the major methods for solving the convex quadratic optimization problems being associated with support vector machines. For a special c...
—Machine learning is inherently a multiobjective task. Traditionally, however, either only one of the objectives is adopted as the cost function or multiple objectives are aggreg...
We consider incorporating action elimination procedures in reinforcement learning algorithms. We suggest a framework that is based on learning an upper and a lower estimates of th...