Posts

Social Reinforcement Learning

Last edited: August 8, 2025

Key question: can multi-agent optimization problems help reinforcement learning?

using deep RL for combinatorial optimization

  • fast inference scales well with instance size
  • may be difficult to actually discover the optimal solution: high sample complexity, or failing to find good solutions
  • doesn’t generalize well

why multi-agent works

  • decentralized training to improve sample efficiency
  • adversarial training

New Concepts

Important Results / Claims

Social Security Administration

Last edited: August 8, 2025

The Social Security Administration is a U.S. federal agency that administers welfare programs which directly give cash to those who are in need.

softmax

Last edited: August 8, 2025

\begin{equation} \text{softmax}\qty(x_{i}) = \frac{\exp(x_{i})}{\sum_{j=1}^{n} \exp(x_{j})} \end{equation}

it's “softmax”!

  • max: amplifies the probability of the largest \(x_{i}\)
  • soft: because it assigns some probability to smaller \(x_{j}\) as well
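
To make the definition above concrete, here is a minimal sketch in plain Python. Subtracting the maximum before exponentiating is an addition of mine for numerical stability (it cancels in the ratio, so the result matches the formula); the function name and the sample inputs are just for illustration.

```python
import math


def softmax(xs):
    """Map a list of real scores to a probability distribution."""
    # Shifting by the max leaves the ratio unchanged but keeps
    # exp() from overflowing on large inputs.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([1.0, 2.0, 3.0])
# The largest input gets the largest share ("max"),
# but every input keeps a nonzero probability ("soft").
```

The outputs always sum to 1, and widening the gap between inputs pushes more of the mass onto the largest one.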

Software Design and Architecture Patterns

Last edited: August 8, 2025

:clap: What. Does. The. Client. Want.

Web Applications vs Local Applications

  • scale—what levels of functionality and access do we want
  • training
  • speed

SOLID principles

SOLID is a set of OOP principles; it's kinda famous, but it encourages mindless, braindead Java-dev habits.

  • Single Responsibility: that a class should have only one clearly defined thing it represents, and the class should only change IFF the underlying spec regarding that thing changes
    • Easy pitfalls: mixing PERSISTENCE LOGIC with BUSINESS LOGIC (db should be moved to a separate class like ThingProvider/ThingPersistence)
  • Open-Closed Principle: classes should be open to extension and closed to modification
    • “we should be able to add new functionality without touching what’s written”
    • so like interfaces are nice
  • Liskov Substitution Principle: subclasses should act like base classes (and more); good inheritance systems should have this built in
  • Interface Segregation Principle: you should build lots of small interfaces + sub-interfaces based on what clients need and will need, such that a client only has to depend on precisely the amount needed to do its job
  • Dependency Inversion Principle: when possible, depend on abstract classes or interfaces and not their implementations
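
A rough sketch of how a few of these fit together, assuming Python's `typing.Protocol` stands in for an interface. `ThingPersistence` echoes the naming suggested above; `Thing`, `InMemoryThingPersistence`, and `ThingPricing` are hypothetical names made up for this example, not a prescribed layout.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass
class Thing:
    name: str
    price: float


class ThingPersistence(Protocol):
    """Abstraction over storage; business logic depends only on this."""

    def save(self, thing: Thing) -> None: ...
    def load(self, name: str) -> Thing: ...


class InMemoryThingPersistence:
    """One concrete store. A database-backed store could be added later
    without modifying ThingPricing (open to extension, closed to modification)."""

    def __init__(self) -> None:
        self._rows = {}

    def save(self, thing: Thing) -> None:
        self._rows[thing.name] = thing

    def load(self, name: str) -> Thing:
        return self._rows[name]


class ThingPricing:
    """Business logic only (single responsibility): no storage details here."""

    def __init__(self, store: ThingPersistence) -> None:
        # Dependency inversion: we hold the abstraction, not an implementation.
        self._store = store

    def discount(self, name: str, fraction: float) -> Thing:
        thing = self._store.load(name)
        discounted = Thing(thing.name, thing.price * (1 - fraction))
        self._store.save(discounted)
        return discounted


store = InMemoryThingPersistence()
store.save(Thing("widget", 100.0))
result = ThingPricing(store).discount("widget", 0.25)
```

Persistence changes only touch the store classes, and pricing changes only touch `ThingPricing`, which is the separation the pitfall under Single Responsibility is warning about.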

Dependency Injection

“Dependency Injection” is a 25-dollar term for a 5-cent concept. […] Dependency injection means giving an object its instance variables. […].
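
In code, "giving an object its instance variables" can look like the sketch below. `Greeter` and its `clock` parameter are hypothetical names chosen for illustration; the only point is that the dependency is passed in through the constructor instead of being created inside the class.

```python
from datetime import datetime


class Greeter:
    def __init__(self, clock):
        # The dependency is handed in, not constructed here.
        # `clock` is any zero-argument callable returning the current hour.
        self._clock = clock

    def greeting(self):
        return "Good morning" if self._clock() < 12 else "Good afternoon"


# Production wiring: inject a clock that reads the real time.
real_greeter = Greeter(lambda: datetime.now().hour)

# Test wiring: inject a fixed clock so the behavior is predictable.
morning_greeter = Greeter(lambda: 9)
afternoon_greeter = Greeter(lambda: 15)
```

Because the clock is injected, the same class can be exercised without waiting for a particular time of day.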

software dev starter pack

Last edited: August 8, 2025

Here’s a bit of a guide to start in software development. It is mostly links to other resources that would help.

Introductory Remarks

Nobody “learns” software development. Even in job interviews, people expect you to have “worked” in software development. The industry, as a whole, runs on “learn-by-doing”, so it's best to start thinking about what you want to achieve with software dev in terms of projects, then look specifically for resources to help you achieve those. Once you Google enough, et voilà! You will have the skills needed to tackle another project.