What if you don’t know the probability of success?

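Beta distribution time! Model the unknown probability with a Beta distribution and update it as successes and failures are observed.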
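A minimal sketch of how a Beta distribution can track an unknown success probability. The class name `BetaBelief` and the uniform Beta(1, 1) starting prior are assumptions made for illustration, not something these notes specify.

```python
from dataclasses import dataclass


@dataclass
class BetaBelief:
    """Belief about an unknown success probability, stored as Beta(alpha, beta)."""

    alpha: float = 1.0  # assumed prior: Beta(1, 1), i.e. uniform on [0, 1]
    beta: float = 1.0

    def update(self, success: bool) -> None:
        # Conjugate update: a success adds 1 to alpha, a failure adds 1 to beta.
        if success:
            self.alpha += 1
        else:
            self.beta += 1

    def mean(self) -> float:
        # Mean of Beta(alpha, beta), used as the point estimate of the probability.
        return self.alpha / (self.alpha + self.beta)


belief = BetaBelief()
for outcome in [True, True, False, True]:
    belief.update(outcome)
print(belief, belief.mean())
```

After three successes and one failure the belief is Beta(4, 2), whose mean is 4/6. More observations make the distribution narrower, which is what the bandit strategies rely on.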

## Multi-Armed Bandit

Strategies:

- Upper Confidence Bound (UCB): take the action with the highest upper confidence bound on its estimated probability of success
- Posterior Sampling (Thompson sampling): draw one sample from each action's Beta distribution; take the action whose sample is the highest
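The two strategies could be sketched roughly as follows for arms with success/failure outcomes. The function names, the Beta(1, 1) prior, and the simulated `true_probs` are all assumptions for illustration. The notes do not say which confidence bound to use, so this sketch swaps in the common UCB1 index (empirical mean plus `sqrt(2 ln t / n)`), which is only one of several options.

```python
import math
import random


def posterior_sampling_choice(successes, failures):
    """Draw one sample from each arm's Beta posterior and pick the largest."""
    samples = [
        random.betavariate(1 + s, 1 + f)  # assumed Beta(1, 1) prior per arm
        for s, f in zip(successes, failures)
    ]
    return max(range(len(samples)), key=samples.__getitem__)


def ucb1_choice(successes, failures):
    """Pick the arm with the highest UCB1 index (one possible confidence bound)."""
    pulls = [s + f for s, f in zip(successes, failures)]
    # Try every arm once first so each index is well defined.
    for arm, n in enumerate(pulls):
        if n == 0:
            return arm
    total = sum(pulls)
    scores = [
        s / n + math.sqrt(2 * math.log(total) / n)  # mean + exploration bonus
        for s, n in zip(successes, pulls)
    ]
    return max(range(len(scores)), key=scores.__getitem__)


def simulate(choose, true_probs, rounds):
    """Run a strategy against hypothetical arms with hidden success rates."""
    successes = [0] * len(true_probs)
    failures = [0] * len(true_probs)
    for _ in range(rounds):
        arm = choose(successes, failures)
        if random.random() < true_probs[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return successes, failures


random.seed(0)
for strategy in (ucb1_choice, posterior_sampling_choice):
    wins, losses = simulate(strategy, true_probs=[0.2, 0.5, 0.8], rounds=1000)
    pulls_per_arm = [w + l for w, l in zip(wins, losses)]
    print(strategy.__name__, pulls_per_arm)
```

Both strategies should end up pulling the best arm most often. The difference is how they explore: UCB1 adds a deterministic bonus that shrinks as an arm is pulled more, while posterior sampling explores through the randomness of the Beta samples.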