
Training Helpful Chatbots

Last edited: August 8, 2025

What we have been building since ChatGPT at H4.

  • No pretraining of any kind

Basic Four Steps

Goal: “helpful, harmless, honest, and huggy” bots.

  1. Pretraining step: large-scale next-token prediction
  2. In-context learning: few-shot learning without updating parameters (first sketch below)
  3. “Helpful” steps
    1. Using supervised data to perform supervised fine-tuning (SFT; second sketch below)
  4. “Harmless” steps
    1. Training a classifier for ranking results (third sketch below)
    2. RLHF
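Step 2 involves no gradient updates at all: the task is specified entirely by examples in the prompt. A minimal sketch, assuming a Hugging Face pipeline; the model name and prompt are illustrative (gpt2 is far too small to do this reliably and only stands in for a large model):

  # In-context learning: weights stay frozen; the task is specified by
  # few-shot examples in the prompt alone.
  from transformers import pipeline

  generator = pipeline("text-generation", model="gpt2")  # placeholder model
  few_shot_prompt = (
      "Translate English to French.\n"
      "sea otter -> loutre de mer\n"
      "cheese -> fromage\n"
      "butter ->"
  )
  # No parameters are updated; we only sample a continuation.
  print(generator(few_shot_prompt, max_new_tokens=4)[0]["generated_text"])

For the “helpful” SFT step, the model keeps its next-token-prediction loss but trains on (instruction, demonstration) pairs. A minimal sketch, again with gpt2 as a stand-in; the prompt template and toy data are assumptions, not H4's actual setup:

  # SFT: the same next-token cross-entropy loss as pretraining, but on
  # supervised (instruction, demonstration) pairs.
  import torch
  from transformers import AutoModelForCausalLM, AutoTokenizer

  tok = AutoTokenizer.from_pretrained("gpt2")
  tok.pad_token = tok.eos_token
  model = AutoModelForCausalLM.from_pretrained("gpt2")

  pairs = [("What is RLHF?", "Fine-tuning a model against human preferences.")]
  texts = [f"### Instruction:\n{q}\n### Response:\n{a}{tok.eos_token}"
           for q, a in pairs]

  batch = tok(texts, return_tensors="pt", padding=True)
  labels = batch["input_ids"].clone()
  labels[batch["attention_mask"] == 0] = -100  # mask padding out of the loss

  optim = torch.optim.AdamW(model.parameters(), lr=5e-5)
  loss = model(**batch, labels=labels).loss  # shifted next-token CE loss
  loss.backward()
  optim.step()

The “harmless” classifier is a reward model scoring completions with a scalar head; RLHF then optimizes the chatbot against it. A hedged sketch of one common ranking objective, a pairwise Bradley-Terry loss over a chosen vs. rejected completion, with gpt2 as a placeholder backbone and made-up example pairs:

  # Reward model: score completions with a scalar head and train with a
  # pairwise ranking loss on chosen vs. rejected outputs.
  import torch
  from transformers import AutoModelForSequenceClassification, AutoTokenizer

  tok = AutoTokenizer.from_pretrained("gpt2")
  tok.pad_token = tok.eos_token
  rm = AutoModelForSequenceClassification.from_pretrained("gpt2", num_labels=1)
  rm.config.pad_token_id = tok.pad_token_id

  chosen = tok("Q: Can you help? A: Of course, happy to!", return_tensors="pt")
  rejected = tok("Q: Can you help? A: No. Go away.", return_tensors="pt")

  r_chosen = rm(**chosen).logits.squeeze(-1)      # scalar reward, preferred
  r_rejected = rm(**rejected).logits.squeeze(-1)  # scalar reward, dispreferred

  # Push the preferred reward above the dispreferred one.
  loss = -torch.nn.functional.logsigmoid(r_chosen - r_rejected).mean()
  loss.backward()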

Benchmarking

Before we start training, we have a problem: most benchmarks focus on generic reasoning, which evaluates steps 1) and 2). Therefore, we need new metrics for steps 3) and 4).

training set

Last edited: August 8, 2025

transformational generative syntax

Last edited: August 8, 2025

Transformational generative syntax is a linguistic theory proposed by Noam Chomsky which has the interesting conclusion that meaning is supported by structure, rather than the other way around, as generative semantics suggests.

This means that you can first come up with a generic, meaning-independent structure for a sentence, then fill in the sentence with meaning.

For instance, “colorless green ideas sleep furiously” is a sentence Noam Chomsky proposes to have perfect structure while failing to be filled with meaning, supporting the transformational generative syntax theory.
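A quick way to see the claim is to hand the sentence to a parser with a toy grammar: the parse succeeds on structure alone. A minimal sketch using NLTK, with an ad-hoc grammar written just for this sentence (the grammar is illustrative, not Chomsky's):

  # Toy CFG that accepts Chomsky's sentence: the parse succeeds purely
  # on structure, independent of meaning.
  import nltk

  grammar = nltk.CFG.fromstring("""
  S -> NP VP
  NP -> Adj NP | N
  VP -> V Adv
  Adj -> 'colorless' | 'green'
  N -> 'ideas'
  V -> 'sleep'
  Adv -> 'furiously'
  """)

  parser = nltk.ChartParser(grammar)
  for tree in parser.parse("colorless green ideas sleep furiously".split()):
      tree.pretty_print()  # a well-formed syntax tree despite the nonsense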

Transformer Speech Diarization

Last edited: August 8, 2025

Background

Current deep-learning-first approaches have shown promising results on the speech diarization task. For ASR-independent diarization specifically, two main methods appear to yield fruitful results:

  1. Auditory feature extraction using deep learning: a trained, fixed-size latent representation is built from Mel-frequency cepstral coefficient (MFCC) slices that come from any existing voice-activity detection (VAD) scheme (Snyder et al. 2018); the features extracted with the neural network are then used with traditional clustering and Variational Bayes refinement approaches (Sell et al. 2018; Landini et al. 2022) to produce groups of diarized speakers (sketched below)
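A hedged sketch of pipeline (1): MFCC slices are cut at segment boundaries that a VAD would supply, embedded, and clustered into speakers. The file name, segment times, and speaker count are assumptions, and mean-pooling stands in for the trained embedding network:

  # Sketch of pipeline (1): MFCC slices from (pretend) VAD segments,
  # a placeholder embedding network, then clustering into speakers.
  import numpy as np
  import librosa
  from sklearn.cluster import AgglomerativeClustering

  wav, sr = librosa.load("meeting.wav", sr=16000)  # hypothetical recording

  # Pretend an off-the-shelf VAD produced these (start, end) times in seconds.
  segments = [(0.0, 1.5), (1.5, 3.2), (3.2, 5.0)]

  def embed(mfcc):
      # Stand-in for the trained embedding network (cf. Snyder et al. 2018);
      # here we just mean-pool the slice into a fixed-size vector.
      return mfcc.mean(axis=1)

  embeddings = []
  for start, end in segments:
      chunk = wav[int(start * sr):int(end * sr)]
      mfcc = librosa.feature.mfcc(y=chunk, sr=sr, n_mfcc=20)
      embeddings.append(embed(mfcc))

  # Cluster the fixed-size embeddings into speaker groups (2 speakers assumed).
  labels = AgglomerativeClustering(n_clusters=2).fit_predict(np.array(embeddings))
  print(labels)  # one speaker index per segment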