Training Helpful Chatbots
Last edited: August 8, 2025
“What we have been building since ChatGPT at H4.”
- No pretraining in any way
Basic Three Steps
Goal: “helpful, harmless, honest, and huggy” bots.
- Pretraining step: large-scale next-token prediction
- In-context learning: few-shot learning without updating parameters
- “Helpful” steps
- Using supervised data to perform supervised fine-tuning
- “Harmless” steps
- Training a classifier for result ranking
- RLHF
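The "classifier for result ranking" step can be sketched as follows. This is a minimal illustration, assuming the classifier assigns a scalar score to each response and is trained with a pairwise logistic loss on human preference pairs, which is a common choice for reward models. The notes do not say which loss was actually used, and the function names `pairwise_ranking_loss` and `rank_responses` are hypothetical.

```python
import math


def pairwise_ranking_loss(score_chosen: float, score_rejected: float) -> float:
    """Loss for one preference pair: -log(sigmoid(score_chosen - score_rejected)).

    Assumption: the ranking classifier emits one scalar score per response.
    The loss is small when the human-preferred response scores higher.
    """
    margin = score_chosen - score_rejected
    # Numerically stable form of log(1 + exp(-margin)).
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


def rank_responses(scores: dict) -> list:
    """Order candidate responses from highest to lowest classifier score."""
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    # Hypothetical scores for three candidate responses to one prompt.
    candidates = {"response_a": 0.1, "response_b": 0.9, "response_c": 0.5}
    print(rank_responses(candidates))
    print(pairwise_ranking_loss(0.9, 0.1))
```

In an RLHF setup, scores like these would typically serve as the reward signal for the policy model. Here they are only used to order candidates.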
Benchmarking
Before we started training, we had a problem. Most benchmarks target generic reasoning, which evaluates only steps 1) and 2) (pretraining and in-context learning). Therefore, we need new metrics for steps 4) and 5) (the ranking classifier and RLHF).
training set
Last edited: August 8, 2025
transformational generative syntax
Last edited: August 8, 2025
transformational generative syntax is a linguistic precept proposed by Noam Chomsky. Its interesting conclusion is that meaning is supported by structure, rather than the other way around, as generative semantics suggests.
This means that you can first come up with generic, independent structure to a sentence, then fill in the sentence with meaning.
For instance, “colorless green ideas sleep furiously” is a sentence Noam Chomsky proposed as having perfect structure while failing to carry any meaning, which supports the transformational generative syntax theory.
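The "structure first, then meaning" idea can be illustrated with a toy sketch. It assumes a sentence structure can be written as a flat list of part-of-speech slots, which is a deliberate simplification rather than Chomsky's formalism. The slot labels and the helper `fill_structure` are hypothetical.

```python
def fill_structure(structure, choices):
    """Fill each slot in `structure` with the next word offered for its category."""
    iterators = {category: iter(words) for category, words in choices.items()}
    return " ".join(next(iterators[slot]) for slot in structure)


# One generic structure, defined independently of any meaning.
STRUCTURE = ["ADJ", "ADJ", "NOUN", "VERB", "ADV"]

# The same structure filled with words that do not combine into a meaning...
nonsense = fill_structure(
    STRUCTURE,
    {"ADJ": ["colorless", "green"], "NOUN": ["ideas"], "VERB": ["sleep"], "ADV": ["furiously"]},
)

# ...and with words that do.
sensible = fill_structure(
    STRUCTURE,
    {"ADJ": ["tired", "old"], "NOUN": ["dogs"], "VERB": ["sleep"], "ADV": ["soundly"]},
)

if __name__ == "__main__":
    print(nonsense)
    print(sensible)
```

Both outputs share the same structure. Only the second one is filled with meaning.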
Translation Studies Index
Last edited: August 8, 2025
Translation Theory
Last edited: August 8, 2025
Translation Theory is the theory that studies how translation works.
Spectrum of Translation
domestication and foreignization are processes by which a translator can choose to alter the style of a translation for a purpose.
foreignization
trying to bring the target language closer to the source language
- bring in foreign words
- use colourful idioms
- use old words
domestication
trying to bring the source language closer to the target language
