Social Reinforcement Learning
Last edited: August 8, 2025
Key question: can multi-agent optimization problems help reinforcement learning?
using deep RL for combinatorial optimization
- fast inference that scales well with instance size
- may struggle to actually discover the optimal solution: high sample complexity, or failure to find good solutions
- doesn’t generalize well
why multi-agent works
- decentralized training to improve sample efficiency
- adversarial training
New Concepts
Important Results / Claims
Social Security Administration
Last edited: August 8, 2025
The Social Security Administration is the U.S. federal agency that administers programs which give cash directly to those who are in need.
softmax
Last edited: August 8, 2025
\begin{equation} \text{softmax}\qty(x_{i}) = \frac{\exp(x_{i})}{\sum_{j=1}^{n} \exp(x_{j})} \end{equation}
it’s “softmax”!
- max: amplifies the probability of the largest \(x_{i}\)
- soft: because it assigns some probability to smaller \(x_{j}\) as well
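To make the "soft" and "max" parts concrete, here is a minimal sketch of the formula in plain Python. Subtracting the maximum before exponentiating is an addition not in the note above: it cancels in the ratio, so the result is unchanged, but it avoids overflow for large inputs. The function name and the sample inputs are just illustrative.

```python
import math


def softmax(xs):
    """Return softmax probabilities for a list of real numbers."""
    # Shifting by the max cancels in the ratio, so the output is the same,
    # but exp() no longer overflows on large inputs.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([1.0, 2.0, 5.0])
# The largest input takes most of the mass ("max"),
# while the smaller inputs still get nonzero probability ("soft").
```

The outputs always sum to 1, so they can be read as a probability distribution over the inputs.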
Software Design and Architecture Patterns
Last edited: August 8, 2025
:clap: What. Does. The. Client. Want.
Web Applications vs Local Applications
- scale—what levels of functionality and access do we want
- training
- speed
SOLID principles
SOLID is a set of OOP principles; it’s kinda famous but encourages mindless braindead Java devs.
- Single Responsibility: a class should represent only one clearly defined thing, and the class should change IFF the underlying spec regarding that thing changes
  - Easy pitfalls: mixing PERSISTENCE LOGIC with BUSINESS LOGIC (db access should be moved to a separate class like ThingProvider/ThingPersistence)
- Open-Closed Principle: classes should be open to extension and closed to modification
  - “we should be able to add new functionality without touching what’s written”
  - so like interfaces are nice
- Liskov Substitution Principle: subclasses should act like base classes (and more); good inheritance systems should have this built in
- Interface Segregation Principle: you should build lots of interfaces + sub-interfaces based on what clients are and will need, such that a client only has to extend precisely the amount needed to do their job
- Dependency Inversion Principle: when possible, depend on abstract classes or interfaces and not their implementations
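As a rough illustration of the Single Responsibility and Dependency Inversion points, here is a small Python sketch. `ThingProvider` borrows its name from the pitfall bullet above; `InMemoryThingProvider`, `Discounter`, and the discount logic are hypothetical and exist only for this example.

```python
from abc import ABC, abstractmethod


class ThingProvider(ABC):
    """Persistence abstraction; business logic depends only on this."""

    @abstractmethod
    def load(self, thing_id: str) -> float: ...

    @abstractmethod
    def save(self, thing_id: str, value: float) -> None: ...


class InMemoryThingProvider(ThingProvider):
    """One concrete store; a database-backed one could be added
    without modifying Discounter (hypothetical example class)."""

    def __init__(self) -> None:
        self._rows = {}

    def load(self, thing_id: str) -> float:
        return self._rows[thing_id]

    def save(self, thing_id: str, value: float) -> None:
        self._rows[thing_id] = value


class Discounter:
    """Business logic only: knows how to discount, not how things are stored."""

    def __init__(self, provider: ThingProvider) -> None:
        self._provider = provider

    def apply(self, thing_id: str, percent: float) -> float:
        price = self._provider.load(thing_id)
        discounted = price * (1 - percent / 100)
        self._provider.save(thing_id, discounted)
        return discounted


store = InMemoryThingProvider()
store.save("widget", 200.0)
new_price = Discounter(store).apply("widget", 25)
```

Here `Discounter` changes only if the discount rules change, and the storage class changes only if storage changes, which is the split the Single Responsibility bullet asks for.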
Dependency Injection
“Dependency Injection” is a 25-dollar term for a 5-cent concept. […] Dependency injection means giving an object its instance variables. […].
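In the spirit of the quote, a minimal Python sketch of "giving an object its instance variables": the object receives its collaborator through the constructor instead of building it internally. The `Greeter` class and the clock callable are hypothetical names made up for this example.

```python
import datetime


class Greeter:
    """Picks a greeting based on the hour reported by an injected clock."""

    def __init__(self, clock):
        # The dependency arrives from outside instead of being created here.
        self._clock = clock

    def greet(self, name):
        hour = self._clock().hour
        part = "morning" if hour < 12 else "afternoon"
        return f"Good {part}, {name}!"


# Normal wiring: inject the real clock.
greeter = Greeter(clock=datetime.datetime.now)


# Test wiring: inject a fixed clock so the result is deterministic.
def fixed_clock():
    return datetime.datetime(2025, 8, 8, 9, 0)


morning = Greeter(clock=fixed_clock).greet("Ada")
```

Because the clock is passed in, swapping the real one for a fixed one needs no change to `Greeter` itself.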
software dev starter pack
Last edited: August 8, 2025
Here’s a bit of a guide to getting started in software development. It is mostly links to other resources that would help.
Introductory Remarks
Nobody “learns” software development. Even in job interviews, people expect you to have “worked” in software development. The industry, as a whole, runs on “learn-by-doing”, so it’s best to start thinking about what you want to achieve with software dev in terms of projects, then look specifically for resources to help you achieve those. Once you Google enough, et voilà! You will have the skills needed to tackle another project.