Posts

IBM704

Last edited: August 8, 2025

The IBM 704 was the first mass-produced computer with floating-point arithmetic hardware (“pretty much the only computer that could handle complex math”); it could perform 12,000 floating-point additions per second. IBM built 123 of these machines between 1955 and 1960.

ICLR2025 Friday Posters

Last edited: August 8, 2025

ICLR2025 Morris: contextual document embeddings

Take a set of existing sentence embeddings as input and produce a new sentence embedding that is conditioned on that context.
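A toy sketch of the idea, not the paper's architecture: here the "contextual" embedding is just the original vector blended with an attention-weighted summary of its context embeddings. The function name `contextualize` and the `mix` parameter are hypothetical, chosen for illustration only.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def contextualize(doc_vec, context_vecs, mix=0.5):
    """Blend a document embedding with an attention-weighted summary of its context.

    Assumption: dot-product attention plus linear mixing stands in for
    whatever learned model actually consumes the context embeddings.
    """
    weights = softmax([dot(doc_vec, c) for c in context_vecs])
    dim = len(doc_vec)
    summary = [sum(w * c[i] for w, c in zip(weights, context_vecs)) for i in range(dim)]
    blended = [(1 - mix) * d + mix * s for d, s in zip(doc_vec, summary)]
    norm = math.sqrt(dot(blended, blended)) or 1.0
    return [x / norm for x in blended]  # unit-normalized output

doc = [1.0, 0.0]
in_context_a = contextualize(doc, [[1.0, 0.0], [0.9, 0.1]])
in_context_b = contextualize(doc, [[0.0, 1.0], [0.1, 0.9]])
```

The same input vector ends up with a different embedding depending on which context it is paired with, which is the property the note describes.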

ICLR2025 Noukhovitch: asynchronous reinforcement learning for language models

Generate rollouts and fine-tune the policy concurrently, rather than alternating between the two.
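A minimal producer/consumer sketch of concurrent rollout and training, using threads and a bounded queue. Everything here is a stand-in: `state["version"]` plays the role of policy weights, and the names `run_async`, `generator`, and `learner` are hypothetical. The point it illustrates is that rollouts may come from a slightly stale policy, so training is off-policy.

```python
import queue
import threading

def run_async(num_batches=5, queue_size=2):
    rollouts = queue.Queue(maxsize=queue_size)  # bounded: limits how stale data can get
    state = {"version": 0}                      # stand-in for policy weights
    lock = threading.Lock()
    log = []  # (policy version that generated the batch, version when it was trained on)

    def generator():
        for i in range(num_batches):
            with lock:
                snapshot = state["version"]     # sample with whatever policy is current
            rollouts.put({"id": i, "policy_version": snapshot})
        rollouts.put(None)                      # sentinel: no more rollouts

    def learner():
        while True:
            batch = rollouts.get()
            if batch is None:
                break
            with lock:
                log.append((batch["policy_version"], state["version"]))
                state["version"] += 1           # stand-in for one gradient update

    threads = [threading.Thread(target=generator), threading.Thread(target=learner)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["version"], log

final_version, log = run_async()
```

Each log entry pairs the policy version that produced a batch with the version that trained on it; a gap between the two is the staleness that an asynchronous setup has to tolerate.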

ICLR2025 Yao: CR-CTC consistency regularization

CTC loss can be made more robust by regularizing the model to have minimal difference between two augmented views of the same mel spectrogram.
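A sketch of only the consistency term, assuming it is a symmetric KL divergence between the per-frame token distributions predicted for the two augmented views. The CTC loss itself is omitted; in a full objective this term would presumably be added to the CTC losses of both views with some weight, which is an assumption here rather than something stated in the note.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl(p, q, eps=1e-12):
    # KL(p || q) with a small epsilon for numerical safety
    return sum(pi * (math.log(pi + eps) - math.log(qi + eps)) for pi, qi in zip(p, q))

def consistency_loss(logits_a, logits_b):
    """Symmetric KL between two views' per-frame distributions, averaged over frames.

    logits_a, logits_b: lists of frames, each frame a list of vocabulary logits.
    """
    total = 0.0
    for frame_a, frame_b in zip(logits_a, logits_b):
        pa, pb = softmax(frame_a), softmax(frame_b)
        total += 0.5 * (kl(pa, pb) + kl(pb, pa))
    return total / len(logits_a)

view_a = [[2.0, 0.1, 0.1], [0.1, 2.0, 0.1]]   # hypothetical logits, 2 frames x 3 tokens
view_b = [[1.5, 0.3, 0.2], [0.2, 0.4, 1.9]]
same = consistency_loss(view_a, view_a)
different = consistency_loss(view_a, view_b)
```

The term is zero when both views yield identical predictions and grows as they disagree, so minimizing it pushes the model toward augmentation-invariant outputs.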

ICLR2025 Sun: ReDeEP detecting hallucination using mechanistic interpretability

Find the layers most prone to inserting information, then use the logit lens to measure how much information each one inserts by comparing the predictions before and after its FFN. A strong change after a hallucination-prone FFN indicates a hallucination.
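A small sketch of the before/after comparison, assuming the change is scored as the Jensen-Shannon divergence between logit-lens distributions of the residual stream before and after the FFN adds its output. The tiny unembedding matrix and the function names are hypothetical, and thresholds for flagging a hallucination are left out.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def logit_lens(hidden, unembed):
    """Project a hidden state straight to a vocabulary distribution."""
    logits = [sum(h * w for h, w in zip(hidden, row)) for row in unembed]
    return softmax(logits)

def js_divergence(p, q, eps=1e-12):
    m = [(a + b) / 2 for a, b in zip(p, q)]
    def kl(x, y):
        return sum(a * math.log((a + eps) / (b + eps)) for a, b in zip(x, y))
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def ffn_shift(resid_before, ffn_out, unembed):
    """How much the FFN's contribution changes the logit-lens prediction."""
    resid_after = [r + f for r, f in zip(resid_before, ffn_out)]
    return js_divergence(logit_lens(resid_before, unembed),
                         logit_lens(resid_after, unembed))

unembed = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # 3-token vocabulary, 2-dim hidden state
resid = [1.0, 0.0]
no_shift = ffn_shift(resid, [0.0, 0.0], unembed)
small_shift = ffn_shift(resid, [0.0, 0.1], unembed)
large_shift = ffn_shift(resid, [0.0, 3.0], unembed)
```

An FFN that barely moves the prediction scores near zero, while one that rewrites it scores high; the note's heuristic treats high scores at hallucination-prone layers as a warning sign.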

ICLR2025 HAIC

Last edited: August 8, 2025

ICLR2025 Koyejo

Proposal: Focus AI measurements on the validity of specific terms.

Five pillars of claim making:

  • content validity: does your evaluation cover all valuable cases?
  • criterion validity: does your evaluation correlate with a known validated standard?
  • construct validity: does your evaluation measure the intended construct?
  • external validity: does your evaluation generalize across different environments or settings?
  • consequential validity: does your evaluation consider the real-world impact of test interpretation and use?
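As one concrete illustration, criterion validity is the pillar most directly checkable in code: correlate a new evaluation's scores with those from an already validated standard. The sketch below uses Pearson correlation; the score lists are made-up placeholder values, not real results.

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two equal-length score lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Placeholder scores for five hypothetical systems
new_benchmark = [0.2, 0.4, 0.5, 0.7, 0.9]
validated_standard = [0.1, 0.35, 0.5, 0.65, 0.95]
r = pearson(new_benchmark, validated_standard)
```

A high correlation supports criterion validity only; it says nothing about the other four pillars.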

Open problem: validity of measurement for claims of HAIC.