Make Models Go Brrr: Model Parallel Whisper Training

Last edited: August 8, 2025

Happy Monday friends.

The deliverable of the week was to make an ASR model for Batchalign. Essentially, most copies of Whisper are pretty bad at Language Sample Analysis (LSA), because they mostly fail to capture the things that people doing LSA want to capture (disfluencies, stuttering, etc.). OpenAI even acknowledged in the paper that they filtered out the disfluencies from their gold transcript to prevent Whisper from writing down too many of them.

Pipes are so bad

Last edited: August 8, 2025

I, for sure, spend equal if not more time writing the data flywheel code compared to architectural changes.

This has been bugging me for a while, but I haven’t had time to sit down and write about it until now. In my mind there are three things that make coming up with a good take about this (and also implementing it) hard:

This may be a problem that’s solved at many of the frontier labs (I would argue more because of workflow optimizations that sidestep the problem, see below)
The “solution” to the problem also seems extremely person-dependent, so it seems hard to make an overall suggestion.
Many people write it off as “necessary engineering”

I don’t purport to have a solution to the pipes problem here, but I want to spend a little time now reflecting on what I have been thinking about while spending days at the office writing pipes.

Welcome to the Fireside

Last edited: August 8, 2025

Fireside is a series of articles that I’m writing to consolidate my learning.

I have always dreamed of blogging. I have even tried once, with a project called 20MinuteRants. The rants worked quite well as a basic format in which I could write about things fairly efficiently (hence the 20 minutes) and reflect on the things I’m up to.

The problem with the project is that I rarely had the motivation to do one. Once I was too busy, or out of ideas to write about, I stopped. If there’s not anything to rant about, why is there a 20MinuteRant?

Why is building a to-do list app so darn hard?

Last edited: August 8, 2025

Why are Todo Lists (a.k.a. personal productivity systems) so hard to build well?

I’m genuinely curious. I was listening to the last episode of Cortex, and one of the hosts (CGP Grey) brought up a similar point regarding personal productivity platforms. OmniFocus, the reigning champion of the industry for professionals looking for a deeply customized system, has been struggling to ship the next version of its application. Much of the market consists of different packagings of the same offering. Grey’s thesis about these platforms essentially boils down to this: