convolution

Last edited: August 8, 2025

For \(f,g : \mathbb{R} \to \mathbb{C}\), the convolution \(f * g\) is defined as:

\begin{equation} (f * g)(x) = \int_{\mathbb{R}} f(x-y) g(y) \dd{y} = \int_{\mathbb{R}} f(y) g(x-y) \dd{y} \end{equation}

properties of convolution

\((g * f) (x) = (f * g) (x)\)
\(\mathcal{F}(f * g) = \mathcal{F}(f)\mathcal{F}(g)\)
\(\mathcal{F}^{-1}(\hat{f} \hat{g}) = f * g\)
\((f * g)' = f * g' = f' * g\)
\(\lambda ( f * g ) = (\lambda f) * g = f * (\lambda g)\)

=> “in a convolution, if ANY ONE of the two functions is Differentiable, the convolution is Differentiable.”; think about smoothing a jagged function by convolving it with a Gaussian.
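The first two properties can be checked numerically on a discrete analogue. This is only a sketch: it assumes finite sequences in place of functions on \(\mathbb{R}\), and the helper names `convolve`, `dft`, and `idft` are made up for illustration. The DFT here is unnormalized, so the product rule holds with no extra constant.

```python
import cmath

def convolve(f, g):
    """Full discrete convolution: (f * g)[n] = sum_k f[k] * g[n - k]."""
    out = [0.0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            out[i + j] += fi * gj
    return out

def dft(x, n):
    """Naive O(n^2) discrete Fourier transform of x, zero-padded to length n."""
    padded = list(x) + [0.0] * (n - len(x))
    return [
        sum(padded[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

def idft(X):
    """Inverse of dft (includes the 1/n factor)."""
    n = len(X)
    return [
        sum(X[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]

f = [1.0, 2.0, 3.0]
g = [0.5, -1.0, 0.0, 4.0]

fg = convolve(f, g)
gf = convolve(g, f)  # commutativity: should match fg

# Convolution theorem: transform, multiply pointwise, transform back.
# Zero-padding to len(fg) makes the circular convolution equal the linear one.
n = len(fg)
product = [a * b for a, b in zip(dft(f, n), dft(g, n))]
via_dft = [z.real for z in idft(product)]
```

Up to floating-point error, `fg`, `gf`, and `via_dft` agree entry by entry.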

Cook-Levin Theorem

Last edited: August 8, 2025

The Cook-Levin Theorem states that SAT and 3SAT are NP-complete. That is, both are in \(NP\), and for every language \(L \in NP\), we have \(L \leq_{p} SAT\), meaning there exists a polynomial-time computable function \(R: \qty {0,1}^{*} \to \qty {0,1}^{*}\) that performs the polynomial time mapping reduction \(x \mapsto \phi_{x}\) such that:

\begin{equation} x \in L \Leftrightarrow R(x) = \phi_{x} \in \text{SAT} \end{equation}

\(3SAT \in NP\)

see 3cnf-formula
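Membership in \(NP\) comes down to having a polynomial-time verifier: given a formula and a candidate assignment, check every clause. A minimal sketch follows, assuming a hypothetical encoding where a formula is a list of clauses and each literal is a nonzero integer (`3` for \(x_3\), `-3` for \(\neg x_3\)). The function name `satisfies` is also just illustrative.

```python
def satisfies(clauses, assignment):
    """Check in time linear in the formula size whether `assignment`
    (dict: variable index -> bool) makes every clause true."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )

# (x1 or not x2 or x3) and (not x1 or x2 or x3) and (not x3 or not x2 or not x1)
phi = [(1, -2, 3), (-1, 2, 3), (-3, -2, -1)]

good = satisfies(phi, {1: True, 2: True, 3: False})
bad = satisfies(phi, {1: True, 2: True, 3: True})
```

Here `good` is `True` and `bad` is `False`: the last clause fails when all three variables are true. The satisfying assignment plays the role of the certificate.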

\(3SAT\) is \(NP\)-hard

We will give a polynomial time mapping reduction.

Fix a language \(A \in NP\). For every string \(w\), we want to compute, in polynomial time, a 3cnf-formula \(f(w) = \phi\) such that \(w \in A\) IFF \(\phi \in 3SAT\).

Cookie Theft Picture Description Task

Last edited: August 8, 2025

Cookie Theft is a Discourse-Completion Task in which the participant describes the Cookie Theft picture.

cornucopia of analysis

Last edited: August 8, 2025

Pythagorean Theorem

\begin{equation} \|u + v\|^{2} = \|u \|^{2} + \|v\|^{2} \end{equation}

if \(v\) and \(u\) are orthogonal vectors.

Proof:
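A short derivation, assuming the norm comes from an inner product via \(\|x\|^{2} = \langle x, x \rangle\): expand the left side and use orthogonality, \(\langle u,v \rangle = \langle v,u \rangle = 0\).

\begin{equation} \|u + v\|^{2} = \langle u+v, u+v \rangle = \langle u,u \rangle + \langle u,v \rangle + \langle v,u \rangle + \langle v,v \rangle = \|u\|^{2} + \|v\|^{2} \end{equation}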

A Useful Orthogonal Decomposition

Suppose we have a vector \(u\), and another \(v\), both belonging to \(V\). We can decompose \(u\) as a sum of two vectors given a choice of \(v\): one a scalar multiple of \(v\), and another orthogonal to \(v\).

That is: we can write \(u = cv + w\), where \(c \in \mathbb{F}\) and \(w \in V\), such that \(\langle w,v \rangle = 0\).
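For \(v \neq 0\), one choice that works is \(c = \frac{\langle u,v \rangle}{\|v\|^{2}}\) and \(w = u - cv\). The sketch below checks this, assuming real vectors stored as Python lists with the standard dot product. The names `dot` and `decompose` are hypothetical.

```python
def dot(a, b):
    """Standard real dot product."""
    return sum(x * y for x, y in zip(a, b))

def decompose(u, v):
    """Return (c, w) with u = c*v + w and <w, v> = 0."""
    vv = dot(v, v)
    if vv == 0:
        # v is the zero vector: u = 0*v + u, and <u, 0> = 0 trivially.
        return 0.0, list(u)
    c = dot(u, v) / vv
    w = [ui - c * vi for ui, vi in zip(u, v)]
    return c, w

u = [3.0, 4.0]
v = [1.0, 0.0]
c, w = decompose(u, v)  # c = 3.0, w = [0.0, 4.0]

cv = [c * vi for vi in v]
# Pythagorean check: cv and w are orthogonal, so the squared norms add up.
lhs = dot(u, u)
rhs = dot(cv, cv) + dot(w, w)
```

Since \(cv\) and \(w\) are orthogonal, `lhs` and `rhs` both come out to 25.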

corpus

Last edited: August 8, 2025

Usually we use \(N\) to denote the number of tokens, and \(V\) to denote the “vocab”, the set of word types.
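As a toy illustration of the token/type distinction, the sketch below assumes naive whitespace tokenization, which is a simplification and not how real corpora are processed. The sample sentence is made up.

```python
text = "the cat sat on the mat the end"

tokens = text.split()   # every running word counts
vocab = set(tokens)     # distinct word types only

N = len(tokens)         # number of tokens
V_size = len(vocab)     # |V|, number of types
```

Here `N` is 8 while `V_size` is 6, because "the" occurs three times but is a single type.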

A corpus is usually considered in the context of:

specific writers
at specific time
for specific varieties
of specific languages
for a specific function

Particularly hard: code switching, gender, demographics, variety, etc.

Herdan’s Law

\begin{equation} |V| = kN^{\beta} \end{equation}

where \(k\) is a positive constant and \(\beta\) is a constant, typically with \(0.67 < \beta < 0.75\).

The vocab size grows sublinearly in the number of tokens: faster than \(\sqrt{N}\), but slower than \(N\) itself.
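To get a feel for the formula, the sketch below evaluates it for two corpus sizes. The values `k = 10.0` and `beta = 0.7` are hypothetical, picked only for illustration. Real values depend on the corpus.

```python
def herdan_vocab_size(n_tokens, k, beta):
    """Predicted vocabulary size |V| = k * N**beta."""
    return k * n_tokens ** beta

k, beta = 10.0, 0.7  # hypothetical constants, not fitted to any corpus

v_small = herdan_vocab_size(1_000_000, k, beta)
v_large = herdan_vocab_size(2_000_000, k, beta)

# Doubling N multiplies |V| by 2**beta, independent of k.
growth = v_large / v_small
```

With these constants, doubling the number of tokens multiplies the predicted vocab size by \(2^{0.7} \approx 1.62\), not by 2.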