advantage function
Last edited: August 8, 2025
an advantage function scores the action a policy takes in a state by how much value it provides relative to the greedy choice in that state:
\begin{align} A(s,a) &= Q(s,a) - U(s) \\ &= Q(s,a) - \max_{a'}Q(s,a') \end{align}
that is, how much does the action-value of the action your policy takes differ from that of the action that maximizes the utility? Because \(U(s)\) is the maximum over all actions, \(A(s,a) \leq 0\) for every action.
For a greedy policy that just optimizes this exact metric, \(A =0\).
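As a minimal sketch of the definition above, assuming a small tabular \(Q\) stored as a NumPy array with one row per state and one column per action (the table values and the helper name `advantage` are made up for illustration):

```python
import numpy as np

def advantage(Q):
    """Return A(s, a) = Q(s, a) - max over actions of Q(s, .) for a (states, actions) table."""
    # U(s) is the best achievable action-value in each state.
    U = Q.max(axis=1, keepdims=True)
    return Q - U

# Hypothetical Q-table: 2 states, 3 actions.
Q = np.array([[1.0, 3.0, 2.0],
              [0.5, 0.5, 4.0]])

A = advantage(Q)

# The greedy action in each state is the one with zero advantage.
greedy_actions = Q.argmax(axis=1)
```

Every entry of `A` is at most zero, and the entries at `greedy_actions` are exactly zero, which matches the statement that a greedy policy has \(A = 0\).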
advertising
Last edited: August 8, 2025
affine subset
Last edited: August 8, 2025
an affine subset of \(V\) is a subset of \(V\) that is the sum of a vector and a subspace; that is, an affine subset of \(V\) is a subset of \(V\) of the form \(v+U\) for some \(v \in V\) and some subspace \(U \subset V\).
for \(v \in V\) and \(U \subset V\), an affine subset \(v+U\) is said to be parallel to \(U\).
that is, the affine subset for a subspace \(U \subset V\) and \(v \in V\) is the set:
\begin{equation} v + U = \{v + u : u \in U\} \end{equation}
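For a concrete instance, here is a rough sketch in \(\mathbb{R}^{2}\), assuming \(U\) is given by a list of spanning vectors. It uses the fact that \(x \in v+U\) exactly when \(x - v \in U\). The helper name `in_affine_subset` and the example vectors are illustrative assumptions, not part of the original note.

```python
import numpy as np

def in_affine_subset(x, v, basis, tol=1e-9):
    """Return True if x lies in v + span(basis)."""
    B = np.atleast_2d(np.asarray(basis, dtype=float))  # rows span the subspace U
    d = np.asarray(x, dtype=float) - np.asarray(v, dtype=float)
    # x - v is in U exactly when appending it to the spanning set does not raise the rank.
    rank_with_d = np.linalg.matrix_rank(np.vstack([B, d]), tol=tol)
    return rank_with_d == np.linalg.matrix_rank(B, tol=tol)

# U = span{(1, 2)} and v = (0, 1), so v + U is the line y = 2x + 1, parallel to U.
v = [0.0, 1.0]
U_basis = [[1.0, 2.0]]

on_line = in_affine_subset([2.0, 5.0], v, U_basis)    # (2, 5) - v = (2, 4) = 2 * (1, 2)
off_line = in_affine_subset([2.0, 4.0], v, U_basis)   # (2, 4) - v = (2, 3) is not in U
```

The rank comparison is only one way to test membership; solving a least-squares problem and checking the residual would work as well.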
affine transformation
Last edited: August 8, 2025
In math, an affine transformation is a transformation that preserves lines and parallelism.
For instance, here is an affine transformation:
\begin{equation} U'(s) = mU(s) + b \end{equation}
where \(m > 0\), and \(b\) is unconstrained. Requiring \(m > 0\) means the transformed utility ranks states in the same order as the original.
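A small sketch of this transformation applied to a vector of utilities, assuming made-up utility values and illustrative choices of \(m\) and \(b\). It checks that a positive slope leaves the ranking of states unchanged:

```python
import numpy as np

def affine_transform(U, m, b):
    """Apply U'(s) = m * U(s) + b elementwise, requiring m > 0."""
    if m <= 0:
        raise ValueError("m must be positive")
    return m * np.asarray(U, dtype=float) + b

# Hypothetical utilities for four states.
U = np.array([2.0, -1.0, 5.0, 0.5])
U_prime = affine_transform(U, m=3.0, b=-4.0)

# With m > 0, sorting states by U or by U' gives the same order.
same_ranking = np.array_equal(np.argsort(U), np.argsort(U_prime))
```

With a negative \(m\) the ranking would be reversed, which is why the sketch rejects non-positive slopes.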
