LLM for Teacher Feedback
Qualitative Changes in Teaching via LLMs
- no clear sign of qualitative changes in teaching attributable to GPT
- no clear catering to individual students
Important Questions
- how to translate training into practice?
- how to cater to student needs?
- what to do with flawed assessments?
Teacher Training
Conventional Teacher Coaching
- not scalable: requires observing the teacher and then giving feedback
- not data-driven, not adaptive; coaching expertise is hard to come by
AI-Powered Coaching
- provide data-driven reflection opportunities
- can be personalized
- but lacks the human connection of an in-person coach
Automated NLP Feedback
- talk-time measurements (a minimal sketch follows this list)
- reflection opportunities, NLP-based measurements, etc.
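A minimal sketch of a talk-time measurement, assuming a toy (speaker, utterance) transcript format; the transcript and all names here are hypothetical illustration, not from the notes:

```python
# Toy talk-time measurement: each speaker's share of words in a transcript.
# The (speaker, utterance) format is an assumption for illustration.
from collections import Counter

transcript = [
    ("teacher", "Today we are going to talk about fractions."),
    ("student", "Is one half bigger than one third?"),
    ("teacher", "Good question. Let's draw both on the board and compare."),
]

words = Counter()
for speaker, utterance in transcript:
    words[speaker] += len(utterance.split())

total = sum(words.values())
for speaker, n in words.items():
    print(f"{speaker}: {n / total:.0%} of words spoken")
```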
GPT wasn’t good at evaluating teachers
- GPT is gaslighting the teachers: it rewords their existing work, so there is no novelty
- GPT is fairly faithful, and the information it gives is relevant
Punitive vs. Restorative Classroom Management
Classroom Management
- reducing use of exclusionary discipline
- improve classroom management to prevent escalation
- teachers feel stressed + under-prepared
Examples
“sit down now” (punitive) vs. “do you need a break?” (restorative)
LLM MPC
Recall MPC: pick a small horizon, define a cost function over the state, optimize the action sequence over that horizon, apply only the first action, then re-plan at the next step.
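A minimal random-shooting sketch of that loop on a toy double integrator (my illustration; the dynamics, cost, and `mpc_action` are assumptions, not from the note):

```python
import numpy as np

def dynamics(x, u):
    # Toy 1-D double integrator: state = (position, velocity), dt = 0.1.
    pos, vel = x
    return np.array([pos + 0.1 * vel, vel + 0.1 * u])

def cost(x, u):
    # Quadratic cost: drive the state to the origin with small control effort.
    return x @ x + 0.01 * u * u

def mpc_action(x, rng, horizon=10, n_samples=256):
    """Random-shooting MPC: sample action sequences, keep the best first action."""
    best_u, best_cost = 0.0, float("inf")
    for _ in range(n_samples):
        u_seq = rng.uniform(-1.0, 1.0, size=horizon)
        xs, total = x, 0.0
        for u in u_seq:
            total += cost(xs, u)
            xs = dynamics(xs, u)
        if total < best_cost:
            best_cost, best_u = total, u_seq[0]
    return best_u

rng = np.random.default_rng(0)
x = np.array([1.0, 0.0])
for _ in range(50):
    x = dynamics(x, mpc_action(x, rng))  # apply only the first action, then re-plan
print(x)  # should end up near the origin
```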
LLMs are fantastic search engines, so I built one
For the past 20 years, semantic indexing sucked.
For the most part, the core offerings of search products in the last while have been divided into two categories:
- Full-text search things (i.e. every app on the face of the planet that stores text), which for the most part use something n-grammy like Okapi BM25 to do nice fuzzy string matching (see the sketch below)
- Ranking/recommendation things, which aren’t so much trying to search a database as they are trying to guess the user’s intent and recommend things from it
And we lived in a pretty happy world in which, depending on the application, developers chose one or the other to build.
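For concreteness, here is a minimal sketch of the first category, assuming the third-party rank_bm25 package; the corpus and query are made up:

```python
# BM25 full-text search sketch, assuming: pip install rank-bm25
from rank_bm25 import BM25Okapi

corpus = [
    "how to reset a forgotten password",
    "resetting your router to factory settings",
    "password strength requirements and tips",
]
tokenized = [doc.split() for doc in corpus]
bm25 = BM25Okapi(tokenized)

query = "reset password".split()
print(bm25.get_scores(query))              # one relevance score per document
print(bm25.get_top_n(query, corpus, n=1))  # best fuzzy string match
```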
LM Alignment
Been Kim
The alignment problem involves aligning the representation spaces of machines with those of humans. Alternative perspective: teach humans new concepts so they can understand and communicate with models better.
feature attribution doesn’t work
We take that perspective because many existing interpretability methods don’t work well (feature permutation, etc.): feature-attribution-style analyses actually have no correlation with predictive results (cf. Been Kim’s “Impossibility Theorems for Feature Attribution”).
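For reference, a minimal sketch of the kind of permutation-based feature attribution being critiqued, using scikit-learn on synthetic data (my illustration, not from the talk):

```python
# Permutation importance: permute each feature and measure the drop in
# held-out accuracy. Synthetic data and model choice are illustrative only.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=8, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)

result = permutation_importance(model, X_te, y_te, n_repeats=10, random_state=0)
for i in np.argsort(result.importances_mean)[::-1]:
    print(f"feature {i}: {result.importances_mean[i]:.3f}")
```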
where feature information is stored in models is unrelated to model-editing success
i.e.: the knowledge-storage location identified with the ROME technique gives you a sense of where information lives, but it doesn’t correlate with the success of model editing.
