_index.org

stuff

Last edited: December 12, 2025

I used to maintain a website at interesting.jemoka.com filled with interesting things, which was kind of neat. That died, but I still find a bunch of things interesting, so here’s a bunch of them.

POMDPs

Holy crap, I can go on forever about them. My advisor Mykel is the biggest POMDP cheerleader of the West, so it kinda rubbed off on me, but the fact that the answer to life, the universe, and everything is

EMNLP2025 Index

Last edited: December 12, 2025

Talks

Posters

Takes

  • although parsing may be dead for natural language, structure still helps parse scientific information (i.e. drugs, molecules, proteins, etc.)
  • two ideas: 1) how to formalize the approach mathematically, 2) what can LMs do that humans can’t do?
  • information-rich statefulness + constraints for pruning the search space is the unlock for the ability to build on previous results; i.e. “critical thinking”

Tasks to Do

Tasks Can Do

mixed-autonomy traffic

Last edited: December 12, 2025

Vehicle Platooning

advantages:

  • reduced congestion
  • greater fuel economy

hierarchical control of platoons

  • vehicle level: lateral and longitudinal control
  • link level: platoon formation, splitting, reordering
  • network level: planning, goods assignment, etc.

Goal: can we dynamically form platoons that minimize travel time while also minimizing fuel cost?
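
As a toy Python sketch of that objective (the travel_time and fuel models, the candidate groupings, and the weight lam are all hypothetical stand-ins, not from any particular paper):

  def platoon_cost(platoons, travel_time, fuel, lam=0.5):
      """Score one grouping of vehicles into platoons.

      platoons: list of lists of vehicle ids. travel_time(p) and fuel(p)
      are assumed problem-specific models (hypothetical). We minimize a
      weighted sum of total travel time and total fuel cost.
      """
      total_time = sum(travel_time(p) for p in platoons)
      total_fuel = sum(fuel(p) for p in platoons)
      return total_time + lam * total_fuel

  def best_grouping(candidates, travel_time, fuel, lam=0.5):
      # brute-force over a small set of candidate groupings; the actual
      # problem needs a smarter, dynamic search over formations
      return min(candidates, key=lambda g: platoon_cost(g, travel_time, fuel, lam))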

Traffic Coordination with MARL

To coordinate UAVs, we can formulate the problem as a decentralized POMDP (Dec-POMDP). Key insight: roll out both your own policy and a simulation of the other agents.

Also run MPC with truncated rollouts.
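
A minimal Python sketch of that rollout idea, assuming a generic step(state, joint_actions) simulator, a reward(state) function, and heuristic policies for the ego agent and the others (all hypothetical names, just to show the structure):

  def truncated_rollout_action(state, ego_actions, others_policy, ego_base_policy,
                               step, reward, horizon=10):
      """One-step lookahead with a truncated rollout.

      For each candidate first action, simulate `horizon` steps in which the
      other agents follow `others_policy` (our simulation of them) and we
      fall back to `ego_base_policy`; pick the best-scoring first action.
      """
      best_action, best_value = None, float("-inf")
      for first_action in ego_actions:
          s, total, action = state, 0.0, first_action
          for _ in range(horizon):
              joint = {"ego": action, **others_policy(s)}  # simulate everyone else
              s = step(s, joint)                           # truncated: only `horizon` steps
              total += reward(s)
              action = ego_base_policy(s)
          if total > best_value:
              best_action, best_value = first_action, total
      return best_action

Re-planning with this at every time step gives the MPC-with-truncated-rollouts behavior mentioned above.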

MoE Review Index

Last edited: December 12, 2025

Project Thoughts

Overall Question: “Why is growing a better idea than training larger models from scratch?”

Cost of Specialization

Sub-Question: “how much of a performance cost does the load-balancing loss incur versus training on specialized data?”

  • For our goals, our data is much more specific (i.e. personalized), meaning we don’t necessarily need to rely on the ModuleFormer load-balancing loss tricks (that loss is sketched after this list).
  • Switch Transformers tells us that with standard regularization, including dropout, one expert can be sufficient to answer many questions (perhaps 1+1, as in shared-expert setups).
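
For concreteness, a minimal PyTorch sketch of the Switch Transformers-style auxiliary load-balancing loss referenced above (tensor shapes and variable names are illustrative):

  import torch
  import torch.nn.functional as F

  def switch_load_balancing_loss(router_logits, num_experts):
      """Auxiliary load-balancing loss from Switch Transformers (Fedus et al.).

      router_logits: (num_tokens, num_experts) pre-softmax routing scores.
      Pushes both f_e (fraction of tokens routed to expert e) and P_e (mean
      router probability for expert e) toward the uniform 1/num_experts.
      """
      probs = F.softmax(router_logits, dim=-1)              # (tokens, experts)
      assigned = probs.argmax(dim=-1)                       # top-1 routing decision
      f = F.one_hot(assigned, num_experts).float().mean(0)  # fraction of tokens per expert
      p = probs.mean(0)                                     # mean router prob per expert
      return num_experts * torch.sum(f * p)

  # usage: loss = task_loss + alpha * switch_load_balancing_loss(logits, E), alpha ~ 1e-2 in the paper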

How Much Expert is an Expert?

Sub-Question: “do all experts have to have the same representational power?”

MOEReview Fedus: Switch Transformers

Last edited: December 12, 2025

At scale, with regularization (including dropout), k=1 on expert routing is fine!
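
A minimal PyTorch sketch of what k=1 (“switch”) routing looks like in a feed-forward MoE layer (layer sizes and names are illustrative, not taken from the paper’s code):

  import torch
  import torch.nn as nn
  import torch.nn.functional as F

  class SwitchFFN(nn.Module):
      """Top-1 (k=1) routed mixture-of-experts feed-forward layer."""

      def __init__(self, d_model=512, d_ff=2048, num_experts=8, dropout=0.1):
          super().__init__()
          self.router = nn.Linear(d_model, num_experts)
          self.experts = nn.ModuleList(
              nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(),
                            nn.Dropout(dropout), nn.Linear(d_ff, d_model))
              for _ in range(num_experts))

      def forward(self, x):                        # x: (num_tokens, d_model)
          probs = F.softmax(self.router(x), dim=-1)
          gate, idx = probs.max(dim=-1)            # k=1: each token picks one expert
          out = torch.zeros_like(x)
          for e, expert in enumerate(self.experts):
              mask = idx == e
              if mask.any():
                  out[mask] = gate[mask, None] * expert(x[mask])
          return out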