uncertainty
Last edited: August 8, 2025
There are many different types of uncertainty.
- Outcome Uncertainty: actions may not have known results
- Model Uncertainty: the model of the problem (e.g. its transitions and rewards) may not be known, so the best action in a state may not be known
- State Uncertainty: current state may not be precisely known
- Interaction Uncertainty: the behavior of other agents acting in the same environment, and how their models interfere with ours, may not be known
unconc
Last edited: August 8, 2025
underfit
Last edited: August 8, 2025
Undirected Exploration
Last edited: August 8, 2025
base epsilon-greedy:
- choose a random action with probability \(\epsilon\)
- otherwise, we choose the action with the best expectation \(\arg\max_{a} Q(s,a)\)
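The two rules above can be sketched as a single action-selection function. This is a minimal sketch, assuming a discrete action set and that `q_values` holds the current estimates \(Q(s,a)\) for one fixed state \(s\); the names `epsilon_greedy`, `q_values`, and `rng` are illustrative and not from the notes.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Return an action index given per-action estimates Q(s, a) for one state.

    q_values: list of floats, one per action (hypothetical input format).
    epsilon:  probability of taking a uniformly random action.
    rng:      any object with random() and randrange(), e.g. random.Random(0).
    """
    if rng.random() < epsilon:
        # Explore: pick any action uniformly at random.
        return rng.randrange(len(q_values))
    # Exploit: pick argmax_a Q(s, a); ties go to the lowest index.
    return max(range(len(q_values)), key=lambda a: q_values[a])
```

With `epsilon = 0` this reduces to pure greedy selection; with `epsilon = 1` every action is drawn uniformly at random.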
epsilon-greedy exploration with decay
Some approaches decay \(\epsilon\) over time, so that at each timestep:
\begin{equation} \epsilon \leftarrow \alpha \epsilon \end{equation}
where \(\alpha \in (0,1)\) is called the “decay factor.”
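Applying the update repeatedly gives \(\epsilon_t = \alpha^{t} \epsilon_0\), so exploration shrinks geometrically. A small sketch of the resulting schedule, assuming the decay is applied once per step after the action is chosen; `decay_schedule` and `epsilon0` are hypothetical names used only for illustration.

```python
def decay_schedule(epsilon0, alpha, steps):
    """Return [epsilon_0, epsilon_1, ...] under the update epsilon <- alpha * epsilon."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie in (0, 1)")
    schedule = []
    epsilon = epsilon0
    for _ in range(steps):
        schedule.append(epsilon)  # value used at this step
        epsilon *= alpha          # decay for the next step
    return schedule


# Halving epsilon at every step, starting from 1.0:
print(decay_schedule(1.0, 0.5, 4))  # [1.0, 0.5, 0.25, 0.125]
```

A value of \(\alpha\) close to 1 keeps exploration going for longer; a smaller value moves to near-greedy behavior quickly.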
Explore-then-commit
Select actions uniformly at random for the first \(k\) steps; after that, always choose the greedy action
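A rough sketch of explore-then-commit in a bandit setting (a single state, so \(Q(s,a)\) is just one running mean per arm). The bandit framing, the `pull` callback, and the choice to keep updating the estimates after step \(k\) are all assumptions made for illustration; arms never sampled during exploration keep an estimate of 0.

```python
import random


def explore_then_commit(pull, n_arms, k, horizon, rng=random):
    """Run explore-then-commit and return the list of arms chosen.

    pull:    hypothetical callback, pull(arm) -> reward (float).
    n_arms:  number of available actions.
    k:       number of uniformly random exploration steps.
    horizon: total number of steps to run.
    """
    counts = [0] * n_arms
    means = [0.0] * n_arms  # running mean reward per arm
    chosen = []
    for t in range(horizon):
        if t < k:
            # Exploration phase: uniform random action.
            arm = rng.randrange(n_arms)
        else:
            # Greedy phase: best estimated arm, from here on.
            arm = max(range(n_arms), key=lambda a: means[a])
        reward = pull(arm)
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]  # incremental mean
        chosen.append(arm)
    return chosen
```

For example, `explore_then_commit(lambda a: [0.2, 1.0, 0.5][a], n_arms=3, k=50, horizon=60)` explores for 50 steps and then settles on the arm whose estimated mean is highest. If \(k\) is too small, the greedy phase can lock onto a suboptimal arm because nothing forces further exploration.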
