ML COVID Drug Discovery
Last edited: August 8, 2025
Focus on the protease: inhibiting it blocks viral replication, and it is conserved across most coronaviruses, so it is a good starting point for drug development.
- Take small fragments that each bind to part of the binding site
- Combine these fragments into a single molecule that fits the binding site well
protease inhibition is usually achieved with a covalent peptide bond, but this crowd-sourcing effort showed that
machine-learning rapid library synthesis
- begin with some guess for the model molecule
- then, use ML to quickly scan through a large number of candidate modifications to the molecule (“ML-prioritized rapid library synthesis”)
- pick the best candidates and repeat
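The loop above can be sketched as a propose, score, select cycle. This is a toy illustration only: the molecule is represented as a tuple of integers, and `propose_variants` and `predicted_score` are hypothetical stand-ins for a real modification enumerator and a trained ML model. In a real campaign the top-ranked candidates would also be synthesized and assayed before the next round.

```python
import random


def propose_variants(molecule, n_variants, rng):
    """Hypothetical stand-in: enumerate small random modifications."""
    variants = []
    for _ in range(n_variants):
        candidate = list(molecule)
        position = rng.randrange(len(candidate))
        candidate[position] += rng.choice([-1, 1])
        variants.append(tuple(candidate))
    return variants


def predicted_score(molecule):
    """Toy surrogate for a model's predicted potency (higher is better)."""
    target = (3, 1, 4, 1, 5)  # arbitrary optimum, purely illustrative
    return -sum((a - b) ** 2 for a, b in zip(molecule, target))


def optimize(initial, rounds=50, n_variants=20, seed=0):
    """Start from a guess, rank modifications by score, pick, and repeat."""
    rng = random.Random(seed)
    best = initial
    for _ in range(rounds):
        candidates = propose_variants(best, n_variants, rng)
        top = max(candidates, key=predicted_score)
        # Keep the new candidate only if the surrogate prefers it.
        if predicted_score(top) > predicted_score(best):
            best = top
    return best


start = (0, 0, 0, 0, 0)
result = optimize(start)
print(start, predicted_score(start))
print(result, predicted_score(result))
```

Because a candidate replaces the current best only when its score is strictly higher, the score never decreases from one round to the next.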
Molecular Transformer
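Write the reaction out as text (e.g. SMILES strings) and throw it into a language model, as WORDS: reaction prediction becomes translation from reactants to products.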
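A minimal sketch of what turning a reaction into text can look like, assuming the reaction is written as a reaction SMILES string (reactants, then `>>`, then products). The token pattern below is a simplified assumption for illustration, not the tokenizer used by any particular model, and the esterification example is chosen here only as a familiar reaction.

```python
import re

# Simplified SMILES token pattern (an assumption for illustration):
# bracketed atoms first, then two-letter halogens, then any single character.
TOKEN_PATTERN = re.compile(r"\[[^\]]+\]|Br|Cl|.")


def tokenize_reaction(reaction_smiles):
    """Split a SMILES string into tokens a sequence model could consume."""
    return TOKEN_PATTERN.findall(reaction_smiles)


# Acetic acid + ethanol >> ethyl acetate (water omitted for brevity).
reaction = "CC(=O)O.OCC>>CC(=O)OCC"
reactants, products = reaction.split(">>")

# Space-separated tokens play the role of "words" in a sentence.
source_text = " ".join(tokenize_reaction(reactants))
target_text = " ".join(tokenize_reaction(products))
print(source_text)
print(target_text)
```

The model would then be trained to map `source_text` to `target_text`, exactly as a translation model maps a sentence in one language to another.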
ML Math Index
Last edited: August 8, 2025
https://web.stanford.edu/class/cs205l/
Lectures
- SU-CS205L JAN072025
- SU-CS205L JAN092025
- SU-CS205L JAN142025
- SU-CS205L JAN162025
- SU-CS205L JAN212025
- SU-CS205L JAN232025
Logistics
MLE for Conditional Gaussian
Last edited: August 8, 2025
Let’s say we want to find MLE parameters \(\theta\) for a conditional Gaussian with constant variance. That is:
\begin{equation} p\qty(y_{i} | x_{i}) = \mathcal{N} \qty(y_{i}|f_{\theta } \qty(x_{i}), \sigma^{2}) \end{equation}
and we have a corresponding dataset of independent samples: \(\qty(x_1, y_1), \dots, \qty(x_{m}, y_{m})\).
Then the MLE is:
\begin{align} \hat{\theta} &= \arg\max_{\theta} \sum_{i=1}^{m} \log p\qty(y_{i}|x_{i}) \\ &= \arg\max_{\theta} \sum_{i=1}^{m} \log \mathcal{N} \qty(y_{i}| f_{\theta} \qty(x_{i}), \sigma^{2}) \\ &= \arg\max_{\theta } \sum_{i=1}^{m} \log \frac{1}{\sqrt{{2 \pi \sigma^{2}}}} \exp \qty(- \frac{\qty(y_{i}- f_{\theta }\qty(x_{i}))^{2}}{2\sigma^{2}}) \end{align}
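Carrying the derivation one step further, under the same assumption that \(\sigma^{2}\) is constant and does not depend on \(\theta\): the log splits into a constant term and a squared-error term, the constant and the positive factor \(\frac{1}{2\sigma^{2}}\) do not affect the \(\arg\max\), and flipping the sign turns the maximization into a minimization. So the MLE here coincides with least squares:
\begin{align} \hat{\theta} &= \arg\max_{\theta} \sum_{i=1}^{m} \qty[ -\frac{1}{2} \log \qty(2 \pi \sigma^{2}) - \frac{\qty(y_{i}- f_{\theta }\qty(x_{i}))^{2}}{2\sigma^{2}} ] \\ &= \arg\max_{\theta} -\frac{1}{2\sigma^{2}} \sum_{i=1}^{m} \qty(y_{i}- f_{\theta }\qty(x_{i}))^{2} \\ &= \arg\min_{\theta} \sum_{i=1}^{m} \qty(y_{i}- f_{\theta }\qty(x_{i}))^{2} \end{align}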
MLlib
Last edited: August 8, 2025
MLlib is a machine learning library built on top of Spark.
from pyspark.mllib.clustering import KMeans
model = KMeans.train(rdd, k)
where rdd is the PySpark RDD of feature vectors that you pass to MLlib, and k is the number of clusters
