AA228/CS238: Probability Review!
Last edited: September 9, 2025
Random Variable
A random variable takes on different values with different probabilities. Each value a random variable takes on is an event.
For instance, here’s a random variable representing a die: \(X\). It can take on the following values, with the following probabilities:
\begin{align} P(X=1) = \frac{1}{6}\\ P(X=2) = \frac{1}{6}\\ \dots \\ P(X=6) = \frac{1}{6} \end{align}
where each assignment \(X=k\) is what we refer to above as an event.
The set of assignments of a random variable and their associated probabilities is called a distribution: a distribution “assigns probabilities to outcomes.” When a random variable \(X\) is “distributed” following a distribution \(D\), we write \(X \sim D\). Semantically, we say \(X\) is a \(D\) random variable.
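As a rough illustration of the die example, the distribution can be sketched as a mapping from each assignment \(X=k\) to its probability. The names `die`, `sample`, and `empirical` are hypothetical, and the seed and sample count are arbitrary choices made only for this sketch:

```python
from fractions import Fraction
import random

# Distribution of a fair six-sided die X: each assignment X=k maps to P(X=k).
die = {k: Fraction(1, 6) for k in range(1, 7)}

# A distribution assigns probabilities to outcomes, and they must sum to 1.
total = sum(die.values())


def sample(distribution, rng):
    """Draw one value of the random variable according to its distribution."""
    values = list(distribution)
    weights = [float(p) for p in distribution.values()]
    return rng.choices(values, weights=weights, k=1)[0]


# Fixed seed (an arbitrary assumption) so repeated runs give the same draws.
rng = random.Random(0)
draws = [sample(die, rng) for _ in range(6000)]

# Empirical frequency of each event X=k among the draws.
empirical = {k: draws.count(k) / len(draws) for k in die}
```

With enough draws, each entry of `empirical` should land close to \(\frac{1}{6}\), which is one way to read the statement that \(X\) is distributed according to this distribution.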
nuuk scratchpad
Last edited: September 9, 2025
knowledgebase testing page
Last edited: August 8, 2025
Like a sound you hear That lingers in your ear But you can’t forget From sundown to sunset
It’s all in the air You hear it everywhere No matter what you do It’s gonna grab a hold on you California soul
\begin{equation} x_1^{(j)} = x_1^{(j-1)} + Attn\qty(x_{k}^{(j-1)}, \forall k) \end{equation}
\begin{equation} At_{x_{1}^{(j-1)}} = \text{softmax}\qty(\frac{q_{1} k_{j}, \forall j}{\sqrt{d_{\ \text{model}}}}) v_{j} \end{equation}
\begin{equation} At_{x_{1}^{(j-1)}} = \text{softmax}_{\text{top-k cliff}}\qty(\frac{q_{1} k_{j}, \forall j}{\sqrt{d_{\ \text{model}}}}) v_{j} \end{equation}