Reliable RL

Last edited: January 1, 2026

Thinking about advances in the capabilities of RL: Knowledge Discovery -> Reasoning (programming assistance) ->(ongoing)-> Robotics

Insight: as time goes on, the “risk-criticality” of our applications increases; yet, as risk-critical scenarios increase, it's harder to get data.

Reliable Feedback Loop

General desirable structure…

Verify (claims and requirements) => Safeguard (safe continuous deployment) => Generalize (via compositional generalization: incrementally adding behavior without losing behavior) => Verify => …

Deal with Stochasticity

An RL algorithm is explicable if, with high probability (WHP), running it on the same MDP with fixed randomness results in the same outcomes.
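A minimal sketch of the definition above, under loud assumptions: the two-state MDP, the tabular Q-learning learner, and the name `run_q_learning` are all hypothetical and chosen only for illustration. Here every random choice flows through one seeded generator and the MDP itself is deterministic, so two runs with the same seed agree exactly; the definition only asks for agreement with high probability.

```python
import random


def run_q_learning(seed, episodes=200, steps=10):
    """Tabular Q-learning on a hypothetical 2-state, 2-action MDP.

    Returns the greedy policy as a dict {state: action}.
    """
    rng = random.Random(seed)  # all randomness comes from this one generator

    # Hypothetical deterministic MDP: transitions[state][action] = (next_state, reward)
    transitions = {
        0: {0: (0, 0.0), 1: (1, 1.0)},
        1: {0: (0, 0.0), 1: (1, 2.0)},
    }
    actions = (0, 1)
    q = {s: {a: 0.0 for a in actions} for s in transitions}
    alpha, gamma, epsilon = 0.1, 0.9, 0.2  # assumed hyperparameters

    for _ in range(episodes):
        state = 0
        for _ in range(steps):
            # Epsilon-greedy exploration: the only source of randomness here.
            if rng.random() < epsilon:
                action = rng.choice(actions)
            else:
                action = max(actions, key=lambda a: q[state][a])
            next_state, reward = transitions[state][action]
            target = reward + gamma * max(q[next_state].values())
            q[state][action] += alpha * (target - q[state][action])
            state = next_state

    return {s: max(actions, key=lambda a: q[s][a]) for s in transitions}


# Same MDP, same fixed randomness: the outcome (greedy policy) is identical.
policy_a = run_q_learning(seed=0)
policy_b = run_q_learning(seed=0)
print(policy_a == policy_b)
```

In a stochastic MDP one would typically fix only the learner's internal randomness and draw fresh environment samples per run, which is where the "with high probability" qualifier starts to matter.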

SU-EE364A JAN132026

Last edited: January 1, 2026

Key Sequence

Notation

New Concepts

Important Results / Claims

Questions

Interesting Factoids

SU-PHIL2 APR012025

Last edited: January 1, 2026

Challenge of moral philosophy: a system which resolves morality and ethics together.

tools

definitions
appearance vs. reality (descriptive vs. true values)
reflective equilibrium

morality

morality is paradigmatically a set of rules/expectations (concerning character/motives/emotions) for right/wrong behavior.

A code of conduct that people “must” follow to…

regulate / guide interpersonal interactions
rules that concern…
- harm / benefit
- justice / fairness
- loyalty / obedience
- sanctity / purity

Key question of morality: what do we owe to each other? what do I owe other people?

SU-SOC175 JAN122025

Last edited: January 1, 2026

Transitional Economy

China shifted from a Soviet-style economy to a market-style economy.

Why is China different?

more gradual, via family outlets
no simultaneous political transition
began at a lower level of economic development
thus, China started in agriculture

East Asian Development Model

We see this in Singapore, HK, Japan, Korea, Taiwan, China.

land to the tiller: moving land away from landlords and giving it to producers
export orientation: moving from low-end products to high-end exports
financial repression: low interest rates (promote exports + make imports more expensive + allow lending), undervalued currency, capital controls on cross-border flows

Unlike Japan, etc., China had to import technology + expertise because it was far behind; it was therefore interested in foreign direct investment.

extended-value extension

Last edited: January 1, 2026

Suppose \(f\) is convex on \(\mathbb{R}^{n}\), with domain \(\text{dom } f\). We can make an extended-value extension:

\begin{equation} \tilde{f} \qty(x) = \begin{cases} f\qty(x), & x \in \text{dom } f \\ \infty, & \text{otherwise} \end{cases} \end{equation}

This can simplify notation, since the domain no longer has to be stated separately.
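A small sketch of the extension, assuming the illustrative choice \(f(x) = -\log x\) with \(\text{dom } f = \{x > 0\}\) (this particular \(f\) and the helper names `f_ext` and `convexity_holds` are not from the notes). It spot-checks that the usual convexity inequality still holds for \(\tilde{f}\) at points outside the domain, which is what lets us drop the explicit domain condition.

```python
import math


def f(x):
    """Illustrative convex function f(x) = -log(x), with dom f = {x > 0}."""
    return -math.log(x)


def f_ext(x):
    """Extended-value extension: f(x) on dom f, +infinity otherwise."""
    return f(x) if x > 0 else math.inf


def convexity_holds(x, y, theta):
    """Check f_ext(theta*x + (1-theta)*y) <= theta*f_ext(x) + (1-theta)*f_ext(y).

    theta is restricted to the open interval (0, 1) because 0 * inf is NaN
    in floating point, unlike the extended-arithmetic convention.
    """
    assert 0 < theta < 1
    z = theta * x + (1 - theta) * y
    return f_ext(z) <= theta * f_ext(x) + (1 - theta) * f_ext(y)


print(f_ext(-1.0))                       # value outside dom f
print(convexity_holds(1.0, 4.0, 0.5))    # both points inside dom f
print(convexity_holds(-1.0, 2.0, 0.5))   # one point outside dom f
print(convexity_holds(-1.0, -3.0, 0.5))  # both points outside dom f
```

A few sample points do not prove convexity; the check is only meant to show how the \(\infty\) value makes the inequality hold trivially whenever an endpoint lies outside \(\text{dom } f\).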