modalization
modalization is the model base
model class
Goal: we need to find a model that is “expressive enough”: it should have enough parameters to match the shape of the data we collect.
constituents
requirements
additional information
selecting parameters
see model fitting
increasing expressiveness
mixture model
We could mix distributions into a mixture model. See Gaussian mixture model.
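As a sketch of what “mixing” means (the weights \(\pi_{k}\) and component densities \(p_{k}\) are notation assumed here, not from the note), a mixture of \(K\) densities is:

\begin{equation} p(x) = \sum_{k=1}^{K} \pi_{k}\, p_{k}(x), \qquad \sum_{k=1}^{K} \pi_{k} = 1, \quad \pi_{k} \geq 0 \end{equation}

In a Gaussian mixture model, each \(p_{k}\) is a Gaussian with its own mean and variance.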
transforming distributions
Suppose you start with:
\begin{equation} Z \sim \mathcal{N}\qty(0,1) \end{equation}
we can sample \(k\) points \(k_{j} \sim Z\), and then transform them through a function \(x_{j}=f(k_{j})\). We now want to know the distribution of \(x_{j}\). It turns out that, if \(f\) is invertible and differentiable, we have:
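Assuming additionally that \(f\) is strictly monotonic (so \(f^{-1}\) is well defined), the standard change-of-variables formula gives the density of \(x\) in terms of the density \(p_{Z}\) of \(Z\):

\begin{equation} p_{X}(x) = p_{Z}\qty(f^{-1}(x)) \left| \frac{d}{dx} f^{-1}(x) \right| \end{equation}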
model fitting
model-based reinforcement learning
Step 1: Getting Model
We want a model consisting of:
- \(T\): transition probability
- \(R\): rewards
Maximum Likelihood Parameter Learning Method
\begin{equation} N(s,a,s') \end{equation}
which is the count of transitions from \(s,a\) to \(s'\); we increment it each time \(s, a, s'\) is observed. With Maximum Likelihood Parameter Learning, this gives:
\begin{equation} T(s' | s,a) = \frac{N(s,a,s')}{\sum_{s''}^{} N(s,a,s'')} \end{equation}
We also keep a table:
\begin{equation} \rho(s,a) \end{equation}
which is the sum of rewards received when taking \(s,a\). To estimate \(R(s,a)\), we take the average:
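Concretely, dividing the summed reward by the number of times \((s,a)\) was taken (the same count used in the denominator above) gives:

\begin{equation} R(s,a) = \frac{\rho(s,a)}{\sum_{s''}^{} N(s,a,s'')} \end{equation}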

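A minimal Python sketch of this counting scheme, assuming experience arrives as \((s, a, r, s')\) tuples; the class and method names are illustrative, not from the note:

```python
from collections import defaultdict


class MaxLikelihoodModel:
    """Tabular maximum likelihood estimates of T(s'|s,a) and R(s,a)."""

    def __init__(self):
        self.N = defaultdict(int)      # N[(s, a, s_next)]: transition counts
        self.rho = defaultdict(float)  # rho[(s, a)]: sum of observed rewards

    def observe(self, s, a, r, s_next):
        """Record one observed transition (s, a, r, s')."""
        self.N[(s, a, s_next)] += 1
        self.rho[(s, a)] += r

    def visits(self, s, a):
        """Total number of times (s, a) has been taken."""
        return sum(n for (si, ai, _), n in self.N.items() if (si, ai) == (s, a))

    def T(self, s, a, s_next):
        """Estimated transition probability T(s'|s,a)."""
        total = self.visits(s, a)
        return self.N[(s, a, s_next)] / total if total > 0 else 0.0

    def R(self, s, a):
        """Estimated reward: average observed reward for (s, a)."""
        total = self.visits(s, a)
        return self.rho[(s, a)] / total if total > 0 else 0.0


# Usage: feed observed transitions, then query the estimated model.
model = MaxLikelihoodModel()
model.observe("s1", "a1", 1.0, "s2")
model.observe("s1", "a1", 0.0, "s1")
print(model.T("s1", "a1", "s2"))  # 0.5
print(model.R("s1", "a1"))        # 0.5
```

Here the visit count is recomputed from the raw transition counts on demand; one could equally cache it, matching the \(\sum_{s''} N(s,a,s'')\) denominator above.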