Course: Decision Making Over Time Under Uncertainty: Markov Decision Processes

Professor: Joachim Arts

ECTS: 2

Aims:

This course introduces students to the theory of Markov decision processes (MDPs). These processes are the archetypical model for making decisions over time under uncertainty. Decisions in supply chains and logistics must often be made under considerable uncertainty that unfolds over time. This course will equip students to model such decision problems as Markov decision processes and analyze them. In particular, they will study computational techniques to find numerical answers to decision problems and analytical techniques to study structural properties of optimal decision rules. We will also give a brief outline of reinforcement learning.

Learning Objectives:

The student who has followed this course:

  • Can formulate a decision problem and correctly identify the state space, action space, transition law, and cost structure.
  • Can use value iteration, policy iteration, and linear programming to numerically solve MDPs on a finite and infinite time horizon.
  • Understands how reinforcement learning is an approximate numerical solution technique for MDPs.
  • Can prove structural properties of optimal policies by induction proofs on the value function. In particular, can use the event-based dynamic programming framework.
  • Can read articles in scientific journals that use MDPs.
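As a small taste of the computational techniques named above, the sketch below runs value iteration on a toy two-state, two-action discounted MDP. The MDP itself (transition matrices, costs, discount factor) is an invented illustrative example, not course material; the algorithm simply iterates the Bellman optimality operator until the value function stops changing.

```python
import numpy as np

# Hypothetical 2-state, 2-action cost-minimization MDP (illustrative only).
# P[a, s, s'] = probability of moving from state s to s' under action a.
P = np.array([
    [[0.9, 0.1],    # action 0
     [0.2, 0.8]],
    [[0.5, 0.5],    # action 1
     [0.6, 0.4]],
])
# c[s, a] = immediate cost of taking action a in state s.
c = np.array([[1.0, 2.0],
              [3.0, 0.5]])
gamma = 0.95  # discount factor

def value_iteration(P, c, gamma, tol=1e-8):
    """Iterate the Bellman optimality operator until the sup-norm change < tol.

    Returns the (approximately) optimal value function and a greedy policy.
    """
    v = np.zeros(P.shape[1])
    while True:
        # Q[s, a] = c[s, a] + gamma * sum_{s'} P[a, s, s'] * v[s']
        q = c + gamma * np.einsum('ast,t->sa', P, v)
        v_new = q.min(axis=1)          # minimize cost over actions
        if np.max(np.abs(v_new - v)) < tol:
            return v_new, q.argmin(axis=1)
        v = v_new

v_opt, policy = value_iteration(P, c, gamma)
print("optimal values:", v_opt)
print("optimal policy:", policy)
```

Because the Bellman operator is a contraction with modulus gamma, the loop is guaranteed to converge; the returned value function satisfies the Bellman optimality equation up to the tolerance.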