Algorithms Index
Last edited: November 11, 2025
Lectures
Divide and Conquer
Sorting
- merge sort: SU-CS161 SEP252025
- recurrence solving: SU-CS161 SEP302025
- median: SU-CS161 OCT022025
- randomized algos + quicksort: SU-CS161 OCT072025
- linear time sorting: SU-CS161 OCT092025
Data Structures
- red-black trees: SU-CS161 OCT142025
- hashing: SU-CS161 OCT212025
Graphs
- DFS/BFS: SU-CS161 OCT232025
- Strongly connected components: SU-CS161 OCT282025
- Dijkstra and Bellman-Ford: SU-CS161 OCT302025
DP
Greedy Algorithms
Closing
EMNLP2025 Index
Last edited: November 11, 2025
Talks
Posters
Takes
- although parsing may be dead for natural language, structure still helps parse scientific information (e.g. drugs, molecules, proteins)
- two ideas: 1) how to formalize the approach mathematically; 2) what can LMs do that humans can’t?
- information-rich statefulness + constraints for pruning the search space are the unlock for building on previous results, i.e. “critical thinking”
Tasks to Do
- EMNLP2025 Fan: medium is not the message: I wonder if we can remove keyword-based signals from BM25 using this method
- EMNLP2025 Xu: tree of prompting: a bunch of multi-hop retrieval datasets to benchmark for RAG-DOLL
Tasks Can Do
- EMNLP2025 Keynote: Heng Ji: “protein LLM requires early exit to capture dynamical behavior”; what if we apply Mixture-of-Depths to a protein LM? (sketch below)
- EMNLP2025 Hutson: measuring informativeness of open-ended questions: formalize this as a ρ-POMDP, or use actual value-of-information measures with a Bellman backup (sketch below)
- EMNLP2025 Karamanolakis: interactive machine teaching: use MCTS-style UCB to pick the next set of constitutions to optimize for (sketch below)
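A minimal sketch of the Mixture-of-Depths idea from the keynote note above: a per-layer router sends only a top-k fraction of tokens through the block while the rest ride the residual stream. Everything here (the MLP stand-in for the block, the capacity fraction, the sigmoid gating) is an illustrative assumption, not the paper’s exact recipe.

```python
import torch
import torch.nn as nn

class MixtureOfDepthsLayer(nn.Module):
    """Route only a fraction of tokens through the block per layer;
    the rest skip it via the residual path."""

    def __init__(self, block: nn.Module, d_model: int, capacity: float = 0.25):
        super().__init__()
        self.block = block                    # any (batch, seq, d) -> (batch, seq, d) module
        self.router = nn.Linear(d_model, 1)  # scalar importance score per token
        self.capacity = capacity              # fraction of tokens that get compute

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        scores = self.router(x).squeeze(-1)             # (batch, seq)
        k = max(1, int(self.capacity * x.shape[1]))
        idx = scores.topk(k, dim=1).indices             # tokens chosen for compute
        out = x.clone()
        for b in range(x.shape[0]):
            sel = idx[b]
            processed = self.block(x[b:b + 1, sel])     # (1, k, d_model)
            gate = torch.sigmoid(scores[b, sel]).unsqueeze(-1)  # keeps routing differentiable
            out[b, sel] = x[b, sel] + gate * processed.squeeze(0)
        return out

# usage: a token-wise MLP standing in for a transformer block over protein residues
mlp = nn.Sequential(nn.Linear(64, 256), nn.GELU(), nn.Linear(256, 64))
layer = MixtureOfDepthsLayer(mlp, d_model=64)
y = layer(torch.randn(2, 100, 64))  # only 25 of 100 residues per sequence hit the MLP
```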
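For the value-of-information idea, a toy one-step Bellman-style backup over possible answers; the belief, rewards, and answer likelihoods are entirely made up (a real ρ-POMDP would instead fold the information measure into the reward function itself).

```python
import numpy as np

def best_expected_reward(belief, rewards):
    """rewards: (n_actions, n_states); best action under the current belief."""
    return (rewards @ belief).max()

def value_of_information(belief, rewards, likelihood):
    """One-step backup: expected best reward after hearing the answer,
    minus best reward acting now. likelihood: (n_answers, n_states)."""
    voi = -best_expected_reward(belief, rewards)
    for like in likelihood:                        # each possible answer
        p_answer = like @ belief
        if p_answer > 0:
            posterior = like * belief / p_answer   # Bayes update
            voi += p_answer * best_expected_reward(posterior, rewards)
    return voi

belief = np.array([0.5, 0.5])          # uniform over two latent states
rewards = np.array([[1.0, 0.0],        # action 0 pays off in state 0
                    [0.0, 1.0]])       # action 1 pays off in state 1
print(value_of_information(belief, rewards, np.eye(2)))  # perfect answer -> VOI = 0.5
```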
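And for constitution selection, a bandit-flavored UCB1 sketch (the tree part of MCTS is omitted); `evaluate` is a hypothetical stand-in for running the interactive teaching loop with a constitution and measuring downstream performance.

```python
import math
import random

def ucb1_select(counts, values, c=2.0):
    """Pick the arm maximizing mean reward + exploration bonus."""
    for i, n in enumerate(counts):
        if n == 0:
            return i                    # try every constitution once first
    total = sum(counts)
    return max(range(len(counts)),
               key=lambda i: values[i] / counts[i]
                             + math.sqrt(c * math.log(total) / counts[i]))

constitutions = ["be concise", "cite sources", "ask clarifying questions"]  # hypothetical
counts = [0] * len(constitutions)
values = [0.0] * len(constitutions)

def evaluate(constitution):
    return random.random()              # hypothetical reward from the teaching loop

for _ in range(100):
    arm = ucb1_select(counts, values)
    counts[arm] += 1
    values[arm] += evaluate(constitutions[arm])
```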
EMNLP2025 Keynote: Heng Ji
Last edited: November 11, 2025
Motivation: drug discovery is extremely slow and expensive; mostly modulating previous iterations of work.
Principles of Drug Discovery
- observation: acquire/fuse knowledge from multiple data modalities (sequence, structure, etc.)
- think: critically generate genuinely new hypotheses, iterating on previous results
- allow LMs to code-switch between modalities (i.e. fuse different modalities together in a uniform way)
An LM as a heuristic helps prune down the search space quickly, as in the sketch below.
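A minimal sketch of that pruning loop, assuming two hypothetical callables: `expand(hypothesis)` proposes successor hypotheses, and `lm_score(hypothesis)` is the LM’s plausibility rating (higher is better).

```python
import heapq

def lm_pruned_search(start, expand, lm_score, beam_width=5, depth=10):
    """Beam search where an LM heuristic prunes the hypothesis space."""
    frontier = [start]
    for _ in range(depth):
        # propose successors for every surviving hypothesis
        candidates = [h for hyp in frontier for h in expand(hyp)]
        if not candidates:
            break
        # keep only the beam_width candidates the LM finds most plausible
        frontier = heapq.nlargest(beam_width, candidates, key=lm_score)
    return max(frontier, key=lm_score)
```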
EMNLP2025 Tuesday Morning Posters
Last edited: November 11, 2025
EMNLP2025 Xu: tree of prompting
Evaluate the quote attribution score as a way to prioritize more factual quotes.
EMNLP2025 Fan: medium is not the message
Unwanted features such as the language or medium show up in embeddings; use linear concept erasure to learn a projection that minimizes information about the unwanted features.
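A simplified sketch of the idea, assuming embeddings `X` and binary labels `y` marking the unwanted feature (e.g. which medium a text came from). It erases only the class-mean-difference direction; the actual method (LEACE-style linear concept erasure) uses a more careful closed-form projection.

```python
import numpy as np

def erase_direction(X, y):
    """Project embeddings onto the orthogonal complement of the
    class-mean-difference direction of the unwanted feature."""
    v = X[y == 1].mean(axis=0) - X[y == 0].mean(axis=0)
    v /= np.linalg.norm(v)
    P = np.eye(X.shape[1]) - np.outer(v, v)  # rank-(d-1) projector
    return X @ P

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 16))
y = rng.integers(0, 2, size=200)
X[y == 1] += 2.0                 # plant a fake "medium" signal
X_clean = erase_direction(X, y)  # class means now coincide along that direction
```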
EMNLP2025 Hong: variance sensitivity induces attention entropy collapse
Softmax is highly sensitive to the variance of its inputs, which is why pre-training loss spikes without QK norm.
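A quick numeric check of that claim (not the paper’s experiment): scale up the variance of a fixed logit vector and watch softmax entropy collapse toward zero, which is the regime QK norm is meant to keep attention out of.

```python
import numpy as np

def softmax_entropy(logits):
    p = np.exp(logits - logits.max())
    p /= p.sum()
    return -(p * np.log(p + 1e-12)).sum()

rng = np.random.default_rng(0)
z = rng.normal(size=64)                      # one row of attention logits
for scale in [0.1, 1, 4, 16, 64]:
    h = softmax_entropy(scale * z)
    print(f"logit std ~{scale * z.std():6.1f}  entropy {h:.3f}")
# entropy starts near log(64) ~ 4.16 and collapses toward 0 as variance grows
```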
SU-CS229 Midterm Sheet
Last edited: November 11, 2025
- matrix calculus
- supervised learning
- gradient descent
- Newton’s Method (sketch at the end of this sheet)
- regression
- \(y|x,\theta\) is linear: Linear Regression
- What if \(X\) and \(y\) are not linearly related? Generalized Linear Model
- \(y|x, \theta\) can be any distribution that’s exponential family
- some exponential family distributions: SU-CS229 Distribution Sheet
- classification
- take linear regression, squish it: logistic regression; for multi-class, use softmax \(p(y=k \mid x) = \frac{\exp(\theta_{k}^{T} x)}{\sum_{j} \exp(\theta_{j}^{T} x)}\) (numeric sketch at the end of this sheet)
- generative learning
- model each class’s distribution, then check which one is more likely: GDA
- Naive Bayes
- bias variance tradeoff
- regularization
- unsupervised learning
- feature maps and precomputed inner products: Kernel Trick
- Decision Tree
- boosting
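A minimal Newton’s Method sketch for the sheet, on a made-up 1-D objective:

```python
def newton(grad, hess, x, steps=10):
    """Minimize via x <- x - f'(x)/f''(x); quadratic convergence near the optimum."""
    for _ in range(steps):
        x = x - grad(x) / hess(x)
    return x

# example: minimize f(x) = (x - 3)^2 + 1; converges in one step since f is quadratic
print(newton(grad=lambda x: 2 * (x - 3), hess=lambda x: 2.0, x=0.0))  # -> 3.0
```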
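And a numerically stable version of the softmax formula from the classification bullet, with made-up parameters:

```python
import numpy as np

def softmax_probs(Theta, x):
    """p(y = k | x) = exp(theta_k^T x) / sum_j exp(theta_j^T x).
    Theta: (K, d), one row theta_k per class; x: (d,)."""
    logits = Theta @ x
    logits -= logits.max()   # shift for stability; probabilities are unchanged
    p = np.exp(logits)
    return p / p.sum()

Theta = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])  # hypothetical parameters
print(softmax_probs(Theta, np.array([2.0, -1.0])))      # sums to 1; class 0 most likely
```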
