_index.org

modalization

Last edited: August 8, 2025

modalization is the

model bae

Last edited: August 8, 2025

model class

Last edited: August 8, 2025

Goal: we need to find a model that is “expressive enough”, meaning it has enough parameters to match the shape of the data we collect.

constituents

requirements

additional information

selecting parameters

see model fitting

increasing expressiveness

mixture model

We could mix several distributions into a mixture model. See Gaussian mixture model.
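As a minimal sketch of mixing distributions, assume a univariate mixture of two Gaussians. The weights, means, and standard deviations below are hypothetical values chosen for illustration, and the function names are not from the original notes. The density is a weighted sum of component densities; sampling picks a component by weight and then draws from it.

```python
import math
import random


def normal_pdf(x, mean, std):
    """Density of a univariate Gaussian N(mean, std^2) at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def mixture_pdf(x, weights, means, stds):
    """Mixture density: weighted sum of the component densities."""
    return sum(w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, stds))


def sample_mixture(weights, means, stds, rng):
    """Pick a component according to its weight, then sample from it."""
    i = rng.choices(range(len(weights)), weights=weights, k=1)[0]
    return rng.gauss(means[i], stds[i])


# Hypothetical two-component mixture (weights must sum to 1).
WEIGHTS = [0.3, 0.7]
MEANS = [-2.0, 3.0]
STDS = [0.5, 1.0]

if __name__ == "__main__":
    rng = random.Random(0)
    print(mixture_pdf(0.0, WEIGHTS, MEANS, STDS))
    print([sample_mixture(WEIGHTS, MEANS, STDS, rng) for _ in range(3)])
```

Adding components adds parameters (one weight, mean, and standard deviation each), which is one way such a model becomes more expressive.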

transforming distributions

Suppose you start with:

\begin{equation} Z \sim \mathcal{N}\qty(0,1) \end{equation}

We can sample points \(k_{j} \sim Z\) and then transform them through a function \(x_{j}=f(k_{j})\). We now want to know the distribution of \(x_{j}\). It turns out that, if \(f\) is invertible and differentiable, we have:
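A minimal numerical sketch, assuming the result referred to here is the standard change-of-variables formula \(p_{x}(x) = p_{z}(f^{-1}(x)) \left| \frac{d f^{-1}(x)}{dx} \right|\). The choice \(f(k) = e^{k}\) is a hypothetical example (it is invertible and differentiable), and all function names are illustrative. The sketch compares the probability mass of an interval computed from the formula against the fraction of transformed samples that land in it.

```python
import math
import random


def standard_normal_pdf(z):
    """Density of Z ~ N(0, 1)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def f(k):
    """Hypothetical transform: invertible and differentiable."""
    return math.exp(k)


def f_inverse(x):
    return math.log(x)


def f_inverse_derivative(x):
    """d/dx of f_inverse(x) = log(x)."""
    return 1.0 / x


def transformed_pdf(x):
    """Change of variables: p_x(x) = p_z(f^{-1}(x)) * |d f^{-1}(x) / dx|."""
    return standard_normal_pdf(f_inverse(x)) * abs(f_inverse_derivative(x))


def integrated_mass(a, b, steps=10_000):
    """Midpoint-rule integral of transformed_pdf over [a, b]."""
    width = (b - a) / steps
    return sum(transformed_pdf(a + (i + 0.5) * width) for i in range(steps)) * width


def empirical_mass(a, b, n, rng):
    """Fraction of transformed samples x_j = f(k_j) that fall in [a, b]."""
    hits = sum(1 for _ in range(n) if a <= f(rng.gauss(0.0, 1.0)) <= b)
    return hits / n


if __name__ == "__main__":
    rng = random.Random(0)
    print(integrated_mass(0.5, 2.0))
    print(empirical_mass(0.5, 2.0, 50_000, rng))
```

The two printed numbers should agree up to sampling noise if the formula describes the distribution of the transformed samples.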

model-based reinforcement learning

Last edited: August 8, 2025

Step 1: Getting Model

We want a model

  • \(T\): transition probability
  • \(R\): rewards

Maximum Likelihood Parameter Learning Method

We keep a table:

\begin{equation} N(s,a,s') \end{equation}

which counts the transitions from \(s,a\) to \(s'\); we increment it each time \(s, a, s'\) is observed. With Maximum Likelihood Parameter Learning, this gives:

\begin{equation} T(s' | s,a) = \frac{N(s,a,s')}{\sum_{s''} N(s,a,s'')} \end{equation}
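The count table and the resulting transition estimate can be sketched as follows. The class and method names are hypothetical, and returning 0.0 for a state-action pair that has never been observed is an assumption of this sketch, not something the notes specify.

```python
from collections import defaultdict


class TransitionCounts:
    """Table of transition counts N(s, a, s')."""

    def __init__(self):
        # Maps (s, a) -> {s_next: count}.
        self.counts = defaultdict(lambda: defaultdict(int))

    def observe(self, s, a, s_next):
        """Increment the count for one observed transition."""
        self.counts[(s, a)][s_next] += 1

    def transition_probability(self, s, a, s_next):
        """Maximum likelihood estimate: count for s_next over all counts for (s, a)."""
        outcomes = self.counts.get((s, a))
        if not outcomes:
            # Assumption: no data for (s, a) yet, so report 0.0.
            return 0.0
        total = sum(outcomes.values())
        return outcomes.get(s_next, 0) / total


if __name__ == "__main__":
    model = TransitionCounts()
    for s_next in ["s1", "s1", "s1", "s0"]:
        model.observe("s0", "right", s_next)
    print(model.transition_probability("s0", "right", "s1"))
```

Storing only observed transitions keeps the table sparse, which matters when most \(s, a\) pairs lead to only a few next states.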

We also keep a table:

\begin{equation} p(s,a) \end{equation}

the sum of rewards when taking \(s,a\). To calculate a reward, we take the average:
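A minimal sketch of that average, assuming the reward estimate is the summed reward for \(s,a\) divided by the number of times \(s,a\) was taken. The class and method names are hypothetical, and returning 0.0 for an unvisited pair is an assumption of this sketch.

```python
from collections import defaultdict


class RewardModel:
    """Running reward sums and visit counts per (s, a)."""

    def __init__(self):
        self.reward_sum = defaultdict(float)  # summed reward for (s, a)
        self.visits = defaultdict(int)        # times (s, a) was taken

    def observe(self, s, a, r):
        """Record one reward received after taking action a in state s."""
        self.reward_sum[(s, a)] += r
        self.visits[(s, a)] += 1

    def reward(self, s, a):
        """Average reward for (s, a): summed reward over visit count."""
        n = self.visits.get((s, a), 0)
        if n == 0:
            # Assumption: no data for (s, a) yet, so report 0.0.
            return 0.0
        return self.reward_sum[(s, a)] / n


if __name__ == "__main__":
    model = RewardModel()
    for r in [1.0, 0.0, 2.0]:
        model.observe("s0", "right", r)
    print(model.reward("s0", "right"))
```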