expectation
Last edited: August 8, 2025
expectation is the calculation of the “intended” or “target” value given a random variable:
\begin{equation} \mathbb{E}[X] = \sum_{x} x\, p(X=x) \end{equation}
- Standardize variables to \(z\) by dividing
- The correlation is simply their “product”: means of positive and negative groups
With data in hand, the expectation is estimated by the average of the observed values, with each value weighted by how often it occurs.
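As a minimal sketch of the sum above, assuming the distribution is given as a dictionary from values to probabilities (the `expectation` helper and the fair-die example are illustrative, not part of the original note):

```python
from fractions import Fraction


def expectation(pmf):
    """Return E[X] = sum over x of x * p(X = x) for a discrete distribution.

    `pmf` maps each value x to its probability p(X = x).
    """
    total = sum(pmf.values())
    if total != 1:
        raise ValueError(f"probabilities must sum to 1, got {total}")
    return sum(x * p for x, p in pmf.items())


# Hypothetical example: a fair six-sided die.
fair_die = {face: Fraction(1, 6) for face in range(1, 7)}
print(expectation(fair_die))  # 7/2
```

Exact fractions are used here so the sum-to-one check does not trip on floating-point rounding.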
properties of expectation
These hold REGARDLESS of whether the variables you are working with are independent, IID, etc.
Linearity in the first slot
expectation has additivity and homogeneity.
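Written out, for random variables \(X\) and \(Y\) and a constant \(c\) (symbols chosen here for illustration), additivity and homogeneity read:
\begin{equation} \mathbb{E}[X + Y] = \mathbb{E}[X] + \mathbb{E}[Y] \end{equation}
\begin{equation} \mathbb{E}[cX] = c\, \mathbb{E}[X] \end{equation}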
explainability
Last edited: August 8, 2025
Explainability is the study of understanding why a system breaks when it does.
Here are a set of explainability techniques!
policy visualization
Roll your system out and look at it
Some common strategies that people use to do this:
- plot the policy: look at what the agent says to do at each state (if you have too many dimensions, just plot slices!)
- slicing: one way to deal with history-dependent trajectories is to then just count the number of actions that your system takes at each step, and plot the argmax of it
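The slicing idea can be sketched as follows, assuming each rollout is recorded as a list of action labels (the function name and the example rollouts are hypothetical). It counts the actions taken at each step across rollouts and keeps the most common one, which is the sequence you would then plot:

```python
from collections import Counter


def most_common_action_per_step(trajectories):
    """For each time step, count actions across rollouts and return the argmax."""
    horizon = max(len(t) for t in trajectories)
    result = []
    for step in range(horizon):
        # Only rollouts that are still running at this step contribute a vote.
        counts = Counter(t[step] for t in trajectories if step < len(t))
        action, _ = counts.most_common(1)[0]
        result.append(action)
    return result


# Hypothetical rollouts of the same system from different start states.
rollouts = [
    ["left", "left", "up"],
    ["left", "up", "up"],
    ["right", "up"],
]
print(most_common_action_per_step(rollouts))  # ['left', 'up', 'up']
```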
feature importance
Our goal is still to understand the contribution of various features to the overall behavior of a system.
explicit programming
Last edited: August 8, 2025
- anticipate all states that the agent may find itself in
- hard-code responses to each one
This is bad because you need a big brain to think about and anticipate all the possible states (to provide a “complete strategy”), which is often impractical if not impossible.
Disadvantages
You have to know the finite set of possible states, and solve each of them “correctly”.
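A toy sketch of what explicit programming looks like, assuming a made-up thermostat-style agent (the states, actions, and `act` helper are all hypothetical). Any state missing from the table has no response at all, which is exactly the weakness described above:

```python
# Hypothetical agent: every state must be anticipated and answered by hand.
RESPONSES = {
    "too_cold": "heat",
    "too_hot": "cool",
    "comfortable": "idle",
}


def act(state):
    """Return the hard-coded response, failing loudly on an unanticipated state."""
    try:
        return RESPONSES[state]
    except KeyError:
        raise ValueError(f"no hard-coded response for state {state!r}") from None


print(act("too_cold"))  # heat
```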
Exploration and Exploitation
Last edited: August 8, 2025
You are the president, and you are trying to choose the secretary of state. You can only interview people in sequence, and you have to hire on the spot. There is a known number of candidates. We want to maximize the probability of selecting the best candidate. You are given no priors.
How do we know which candidates we explore, and which candidates we exploit?
Sometimes, you don’t have a way of getting data.
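One way to get a feel for the hiring setup is to simulate it. The sketch below uses the classical cutoff strategy for this kind of problem, which the note itself does not state: explore by rejecting roughly the first \(n/e\) candidates, then exploit by hiring the first candidate better than everyone seen so far. The function names, the choice of \(n = 20\), and the trial count are assumptions made for illustration.

```python
import math
import random


def hire_with_cutoff(scores, cutoff):
    """Reject the first `cutoff` candidates, then hire the first who beats them all.

    Returns the index of the hired candidate; the last one is hired by default.
    """
    best_seen = max(scores[:cutoff], default=float("-inf"))
    for i in range(cutoff, len(scores)):
        if scores[i] > best_seen:
            return i
    return len(scores) - 1


def success_rate(n, cutoff, trials, seed=0):
    """Estimate how often the strategy hires the single best of n candidates."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        scores = list(range(n))  # n - 1 is the best candidate
        rng.shuffle(scores)      # candidates arrive in random order
        hired = hire_with_cutoff(scores, cutoff)
        wins += scores[hired] == n - 1
    return wins / trials


n = 20
cutoff = round(n / math.e)  # explore phase length; 7 when n = 20
rate = success_rate(n, cutoff, trials=20_000)
print(f"cutoff={cutoff}, estimated success rate={rate:.3f}")
```

For large \(n\) the success probability of this strategy is known to approach \(1/e \approx 0.37\); trying other cutoffs in `success_rate` shows how exploring too little or too much hurts.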
exponential distribution
Last edited: August 8, 2025
Analogous to the Poisson distribution, but for a continuous random variable. Consider a process that runs for some duration of time until success; what’s the probability that success is found in some range of times:
“What’s the probability that there is an earthquake within \(k\) years if there are on average \(2\) earthquakes per year?”
constituents
- \(\lambda\), the “rate”: event rate (mean occurrences per unit time)
requirements
\begin{equation} f(x) = \begin{cases} \lambda e^{-\lambda x}, & x \geq 0 \\ 0, & x < 0 \end{cases} \end{equation}
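To answer range questions like the earthquake one, it helps to integrate the density. A minimal sketch, assuming the standard closed form \(P(X \leq x) = 1 - e^{-\lambda x}\) for \(x \geq 0\), which follows from integrating \(f\) but is not written out in the note (function names are illustrative):

```python
import math


def exponential_pdf(x, rate):
    """Density f(x) = rate * exp(-rate * x) for x >= 0, and 0 otherwise."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    return rate * math.exp(-rate * x) if x >= 0 else 0.0


def exponential_cdf(x, rate):
    """P(X <= x) = 1 - exp(-rate * x) for x >= 0, and 0 otherwise."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    return 1.0 - math.exp(-rate * x) if x >= 0 else 0.0


def prob_between(a, b, rate):
    """P(a <= X <= b): probability that the first success lands in [a, b]."""
    return exponential_cdf(b, rate) - exponential_cdf(a, rate)


# Earthquake example: rate of 2 per year, first earthquake within 1 year.
print(round(exponential_cdf(1.0, 2.0), 4))  # 0.8647
```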
