PUBLICATIONS

On-line Trading of Exploration and Exploitation

Authors

Adam Kalai, B Kappen, JY Audibert, C Szepesvári

Description

Background: Trading off exploration and exploitation plays a key role in a number of learning tasks. For example, the bandit problem ([1],[2],[3],[4]) provides perhaps the simplest case in which we must strike a balance between pulling the arm that appears most advantageous and experimenting with arms for which we do not yet have accurate information. Similar issues arise in any learning problem where the information received depends on the choices made by the learner. Examples include reinforcement learning and active learning, and related questions appear in other disciplines, such as sequential decision-making in statistics and optimal control in control theory.
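
As an illustration of the trade-off described above, the following is a minimal sketch of an epsilon-greedy strategy on a Bernoulli bandit. It is not taken from the publication: the function name `epsilon_greedy`, the arm means, and the parameters `steps`, `epsilon`, and `seed` are all assumptions chosen for the example, and epsilon-greedy is only one simple way to balance exploration and exploitation.

```python
import random


def epsilon_greedy(arm_means, steps=5000, epsilon=0.1, seed=0):
    """Play a Bernoulli bandit with an epsilon-greedy rule.

    arm_means: hypothetical success probability of each arm (unknown to the learner).
    Returns (counts, estimates): pulls per arm and the estimated mean reward per arm.
    """
    rng = random.Random(seed)
    n_arms = len(arm_means)
    counts = [0] * n_arms         # how often each arm has been pulled
    estimates = [0.0] * n_arms    # running average reward of each arm

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm that currently looks best.
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # Bernoulli reward drawn from the (hidden) mean of the chosen arm.
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0

        # Incremental update of the running average for that arm.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]

    return counts, estimates


if __name__ == "__main__":
    counts, estimates = epsilon_greedy([0.2, 0.8])
    print("pulls per arm:", counts)
    print("estimated means:", [round(e, 2) for e in estimates])
```

With a small `epsilon` the learner spends most pulls on the arm it currently believes is best, while the occasional random pull keeps refining its estimates of the other arms.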

Publication

OptimalAI