Posts

reward model

Last edited: August 8, 2025

feed both accepted and rejected into your model, and get two scalars out \(r_{\text{rejected}}\), and \(r_{\text{chosen}}\):

\begin{equation} \mathcal{L}_{RM} = \log \qty(1 + e^{r_{\text{rejected}}-r_{\text{chosen}}}) \end{equation}

  1. train only for one epoch
  2. you should be getting low accuracy scores
  3. you may need to ensemble, margin loss

  • ppo gets the best model

RFDiffusion

Last edited: August 8, 2025
  1. Starting with random residue noise: coordinates + backbones
  2. Diffusion happens: train like diffusion, with the goal of increasing binding affinities
  3. Eventually resolves to valid protein structures given the binding environments

Basically, start with only the desired substraight, and the diffuse the sequence around that small sequence with the goal of higher affinity binding: i.e. allow only the binding site to stay and regenerating the rest.

RFDiffusion is available starting THIS WEEK!

rho-POMDPs

Last edited: August 8, 2025

POMDPs to solve Active Sensing Problem: where gathering information is the explicit goal and not a means to do something. Meaning, we can’t train them using state-only reward functions (i.e. reward is based on belief and not state).

Directly reward the reduction of uncertainty: belief-based reward framework which you can just tack onto the existing solvers.

To do this, we want to define some reward directly over the belief space which assigns rewards based on uncertainty reduction:

Rice's Theorem

Last edited: August 8, 2025

For some predicate \(P\):

\begin{equation} P: \qty {TM} \to \qty {0,1} \end{equation}

think of \(0\) (false), \(1\) (true), where \(P\) satisfies:

  1. non trivial: there are Turing Machines \(M_{yes}\), \(M_{no}\) such that \(P(M_{yes}) = 1\), and \(P(M_{no}) = 0\)
  2. semantic: for all Turing Machines \(M_1\) and \(M_2\), if \(L(M_1) = L(M_2)\) then \(P(M_1) = P(M_2)\)

then, the language \(L = \qty {M \mid P(M) = 1}\) is undecidable.

to do this, check if \(P(M_{\emptyset}) = 0\) or \(P(M_{\emptyset}) = 1\); if the former, then ATM reduces to your language and your language isn’t decidable. If the latter, than not ATM reduces to your language and your language isn’t recognizable.

Richard Nixon

Last edited: August 8, 2025

Richard Nixon is an American president, but pretty much is the watergate guy.

  • Served in House and Senate
  • Eisenhower’s VP for 8 years
  • Lost first to JFK

Richard Nixon is a pragmatist; he pushes economy out of presession via Keynsian Politics.

Richard Nixon also realized that the large southern population can be motivated via racist policies, so he shifted the .

political positions of Richard Nixon