Reinforcement learning is an area of machine learning inspired by behaviorist psychology, concerned with how software agents ought to take actions in an environment so as to maximize some notion of cumulative reward. Because of its generality, the problem is studied in many other disciplines, such as
game theory, control theory, operations research, information theory, simulation-based optimization, multi-agent systems, swarm intelligence, statistics, and
genetic algorithms. In the operations research and control literature, reinforcement learning methods are studied under the name approximate dynamic programming. The problem has also been studied in the theory of optimal control, though most of that work concerns the existence and characterization of optimal solutions rather than learning or approximation. In
economics and game theory, reinforcement learning may be used to explain how equilibrium may arise under bounded rationality.
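
The opening definition, an agent taking actions in an environment so as to maximize cumulative reward, can be made concrete with a minimal sketch. Everything in it is an illustrative assumption rather than part of the definition: the toy environment (action 1 yields reward 1.0, action 0 yields 0.0), the function names `discounted_return` and `run_episode`, and the discount factor `gamma`. A discounted sum is only one common choice for "some notion of cumulative reward".

```python
import random


def discounted_return(rewards, gamma=0.9):
    """Sum the rewards, weighting the reward at step t by gamma**t.

    Assumption: discounting is one possible notion of cumulative reward;
    gamma=1.0 gives the plain undiscounted sum.
    """
    total = 0.0
    for t, reward in enumerate(rewards):
        total += (gamma ** t) * reward
    return total


def run_episode(policy, steps=5, seed=0):
    """Run a policy in a hypothetical toy environment and collect rewards.

    The environment has no state: action 1 pays 1.0 and action 0 pays 0.0.
    """
    rng = random.Random(seed)
    rewards = []
    for _ in range(steps):
        action = policy(rng)  # the agent chooses an action
        rewards.append(1.0 if action == 1 else 0.0)  # the environment responds
    return rewards


def always_one(rng):
    """Policy that always takes the rewarding action."""
    return 1


def random_policy(rng):
    """Policy that picks either action uniformly at random."""
    return rng.choice([0, 1])


if __name__ == "__main__":
    for name, policy in [("always_one", always_one), ("random", random_policy)]:
        rewards = run_episode(policy)
        print(name, rewards, discounted_return(rewards))
```

In this toy setting the policy that always chooses action 1 can never do worse than the random policy. A learning method would have to discover that from the observed rewards, rather than being told it in advance.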