## Key Sequence

## Notation

## New Concepts

- Markov Decision Process
- Bellman Residual
- Approximate Value Function (for continuous state spaces)
- global approximation vs. local approximation methods
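To make the MDP concept concrete, here is a minimal sketch of a tabular MDP representation; the two-state example, its names, and its numbers are illustrative, not taken from the notes.

```python
# Hypothetical tabular MDP:
#   T[s][a] = list of (probability, next_state) pairs
#   R[s][a] = immediate (instantaneous) reward
states = [0, 1]
actions = ["stay", "go"]
T = {
    0: {"stay": [(1.0, 0)], "go": [(0.9, 1), (0.1, 0)]},
    1: {"stay": [(1.0, 1)], "go": [(0.9, 0), (0.1, 1)]},
}
R = {
    0: {"stay": 0.0, "go": 1.0},
    1: {"stay": 1.0, "go": 0.0},
}
gamma = 0.9  # discount factor

# Sanity check: outgoing transition probabilities sum to 1 in every (s, a).
for s in states:
    for a in actions:
        assert abs(sum(p for p, _ in T[s][a]) - 1.0) < 1e-9
```

The Markov property is encoded in the shape of `T`: the next-state distribution depends only on the current state and action.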

## Important Results / Claims

- policy vs. utility: a policy maps states to actions; a utility (value) function scores states by expected discounted reward
- creating a good policy from instantaneous rewards: either policy iteration or value iteration
- creating a policy from a utility function: the value-function policy ("choose the action with the best expected value")
- computing the utility function of a given policy: policy evaluation
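The three operations above can be sketched together; this is a minimal illustration on a made-up two-state MDP (the states, rewards, and tolerance are assumptions, not from the notes).

```python
GAMMA = 0.9
S = [0, 1]
A = ["stay", "go"]
# T[s][a] = list of (probability, next_state); R[s][a] = immediate reward.
T = {0: {"stay": [(1.0, 0)], "go": [(1.0, 1)]},
     1: {"stay": [(1.0, 1)], "go": [(1.0, 0)]}}
R = {0: {"stay": 0.0, "go": 1.0},
     1: {"stay": 1.0, "go": 0.0}}

def q(U, s, a):
    # Action value: immediate reward plus discounted expected utility.
    return R[s][a] + GAMMA * sum(p * U[sp] for p, sp in T[s][a])

def value_iteration(tol=1e-8):
    # Repeat Bellman backups until the residual (max change) is small.
    U = {s: 0.0 for s in S}
    while True:
        U_new = {s: max(q(U, s, a) for a in A) for s in S}
        if max(abs(U_new[s] - U[s]) for s in S) < tol:
            return U_new
        U = U_new

def greedy_policy(U):
    # Value-function policy: take the best-valued action in each state.
    return {s: max(A, key=lambda a: q(U, s, a)) for s in S}

def policy_evaluation(pi, tol=1e-8):
    # Utility of a fixed policy: iterate U(s) = q(U, s, pi(s)).
    U = {s: 0.0 for s in S}
    while True:
        U_new = {s: q(U, s, pi[s]) for s in S}
        if max(abs(U_new[s] - U[s]) for s in S) < tol:
            return U_new
        U = U_new

U_star = value_iteration()
pi_star = greedy_policy(U_star)   # {0: "go", 1: "stay"} on this toy MDP
```

Note that `policy_evaluation` and `value_iteration` share the same backup loop; the only difference is whether the action is fixed by the policy or maximized over.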

- kernel smoothing: a local approximation method for utilities over continuous state spaces
- value iteration, in practice: repeat Bellman backups until the Bellman residual falls below a tolerance
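Kernel smoothing can be sketched as follows: the utility at a query state is a weighted average of utilities known at a finite set of anchor states. The Gaussian kernel, the bandwidth, and the anchor values here are all illustrative assumptions.

```python
import math

# Anchor states with known utilities (e.g. computed on a grid by value
# iteration); both lists are made-up example data.
anchor_states = [0.0, 1.0, 2.0, 3.0]
anchor_utils  = [0.0, 1.0, 4.0, 9.0]

def gaussian_kernel(s, sp, bandwidth=0.5):
    # Weight decays with distance between states (assumed kernel choice).
    return math.exp(-((s - sp) ** 2) / (2 * bandwidth ** 2))

def utility(s):
    # Local approximation: kernel-weighted average of anchor utilities.
    weights = [gaussian_kernel(s, sp) for sp in anchor_states]
    return sum(w * u for w, u in zip(weights, anchor_utils)) / sum(weights)
```

Because every weight is positive and they are normalized, the smoothed utility always lies between the smallest and largest anchor utility, which makes the approximation stable away from the anchors.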