This course studies reinforcement learning methods for modeling and solving management science and marketing problems that involve an explicit trade-off between learning (exploration) and exploiting the information that has already been acquired (i.e., earning). In particular, we will focus on the class of reinforcement learning problems that can be described and modeled as multi-armed bandits. Applications include online advertising, website optimization, clinical trials, new product development, pricing, revenue management, and consumer search.
The 2022 edition of this course is applied in nature and places a strong emphasis on algorithms. It will give you the competence to identify multi-armed bandit (MAB) problems, model them properly, and select appropriate methods to solve them. In addition to gaining general competence in MABs, you will be exposed to the challenges of online MABs, i.e., problems that need to be solved in real time, and to the methods used to tackle them.
The Reinforcement Learning course focuses on using machine learning methods to model and solve problems relevant to management science, in particular those involving machines that autonomously make decisions on behalf of the modeler, as in online settings.
The course is based mainly on reinforcement learning (when we model states and transitions) and multi-armed bandits (when states are not modeled). We will focus on the design, solution, and implementation of learning methods for sequential decision-making under uncertainty. Sequential decision problems involve a trade-off between exploitation (acting on the information already collected) and exploration (gathering more information). These problems arise in many important domains, including online advertising, clinical trials, website optimization, marketing campaigns, and revenue management.
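To make the exploration-exploitation trade-off concrete, here is a minimal sketch of an epsilon-greedy agent on a simulated Bernoulli bandit. It is an illustration only, not necessarily one of the methods covered in the course. The function name `run_epsilon_greedy`, the parameter values, and the success rates (think of three ad variants with different click-through rates) are all hypothetical choices made for this example.

```python
import random


def run_epsilon_greedy(true_rates, n_rounds=5000, epsilon=0.1, seed=0):
    """Simulate an epsilon-greedy agent on a Bernoulli bandit.

    true_rates: success probability of each arm (unknown to the agent).
    Returns (pull counts per arm, estimated mean reward per arm, total reward).
    """
    rng = random.Random(seed)          # seeded so that runs are reproducible
    n_arms = len(true_rates)
    counts = [0] * n_arms              # number of times each arm was pulled
    estimates = [0.0] * n_arms         # running mean reward of each arm
    total_reward = 0

    for _ in range(n_rounds):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random to gather information.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the highest estimated mean so far.
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # Draw a 0/1 reward from the chosen arm (simulated environment).
        reward = 1 if rng.random() < true_rates[arm] else 0

        # Incremental update of the running mean for the chosen arm.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward

    return counts, estimates, total_reward


if __name__ == "__main__":
    # Hypothetical success rates for three arms (assumed for illustration).
    counts, estimates, total = run_epsilon_greedy([0.2, 0.5, 0.8])
    print("pulls per arm:   ", counts)
    print("estimated rates: ", [round(e, 3) for e in estimates])
    print("total reward:    ", total)
```

With `epsilon = 0` the agent never deliberately explores and can lock onto an inferior arm, while with `epsilon = 1` it learns every arm's rate but never uses that knowledge to earn. Values in between trade some immediate reward for information.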