reward model
Last edited: August 8, 2025
Feed both the chosen and the rejected response into your model, and get two scalars out, \(r_{\text{chosen}}\) and \(r_{\text{rejected}}\):
\begin{equation} \mathcal{L}_{RM} = \log \qty(1 + e^{r_{\text{rejected}}-r_{\text{chosen}}}) \end{equation}
- train only for one epoch
- you should expect low accuracy scores
- you may need ensembling or a margin loss
- PPO gets the best model
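The pairwise loss above can be sketched in plain Python. This is a minimal illustration, not a training loop: the names `reward_model_loss` and `pairwise_accuracy` are hypothetical, and the rewards are assumed to be plain floats already produced by some model.

```python
import math


def reward_model_loss(r_chosen: float, r_rejected: float) -> float:
    """Pairwise loss log(1 + exp(r_rejected - r_chosen)) for one preference pair."""
    x = r_rejected - r_chosen
    # Numerically stable softplus: log(1 + e^x) = max(x, 0) + log1p(e^{-|x|}),
    # which avoids overflow when the two rewards are far apart.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def pairwise_accuracy(pairs) -> float:
    """Fraction of (r_chosen, r_rejected) pairs where the chosen reward is higher."""
    pairs = list(pairs)
    return sum(rc > rr for rc, rr in pairs) / len(pairs)
```

The loss shrinks as \(r_{\text{chosen}}\) moves above \(r_{\text{rejected}}\), and equals \(\log 2\) when the two scalars are equal.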
RFDiffusion
Last edited: August 8, 2025
- Starting with random residue noise: coordinates + backbones
- Diffusion happens: train like diffusion, with the goal of increasing binding affinities
- Eventually resolves to valid protein structures given the binding environments
Basically, start with only the desired substrate, and then diffuse the sequence around that small sequence with the goal of higher affinity binding: i.e. allow only the binding site to stay and regenerate the rest.
RFDiffusion is available starting THIS WEEK!
rho-POMDPs
Last edited: August 8, 2025
rho-POMDPs are POMDPs for solving the Active Sensing Problem, where gathering information is the explicit goal and not a means to do something else. This means we can’t train them using state-only reward functions (i.e. the reward is based on the belief and not the state).
Directly reward the reduction of uncertainty: belief-based reward framework which you can just tack onto the existing solvers.
To do this, we want to define some reward directly over the belief space which assigns rewards based on uncertainty reduction:
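As one possible instance (an assumption, not necessarily the reward this note has in mind), negative Shannon entropy of the belief rewards uncertainty reduction directly. The sketch below uses hypothetical names `neg_entropy_reward` and `bayes_update`, and assumes a discrete belief stored as a list of probabilities over a static hidden state.

```python
import math


def neg_entropy_reward(belief):
    """Belief-based reward: sum_s b(s) log b(s), the negative Shannon entropy.

    Assumed choice of reward; higher (closer to 0) means a more certain belief.
    """
    return sum(p * math.log(p) for p in belief if p > 0.0)


def bayes_update(belief, likelihood):
    """Posterior b'(s) proportional to P(o | s) * b(s), for a static hidden state."""
    unnormalized = [l * p for l, p in zip(likelihood, belief)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]
```

Under this choice, an informative observation that sharpens the belief raises the reward, while an uninformative one leaves it unchanged.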
Rice's Theorem
Last edited: August 8, 2025
For some predicate \(P\):
\begin{equation} P: \qty {TM} \to \qty {0,1} \end{equation}
where \(0\) means false and \(1\) means true, and \(P\) satisfies:
- nontrivial: there are Turing Machines \(M_{yes}\), \(M_{no}\) such that \(P(M_{yes}) = 1\), and \(P(M_{no}) = 0\)
- semantic: for all Turing Machines \(M_1\) and \(M_2\), if \(L(M_1) = L(M_2)\) then \(P(M_1) = P(M_2)\)
then, the language \(L = \qty {M \mid P(M) = 1}\) is undecidable.
To prove this, check whether \(P(M_{\emptyset}) = 0\) or \(P(M_{\emptyset}) = 1\), where \(M_{\emptyset}\) is a Turing Machine that accepts nothing. If the former, then \(A_{TM}\) reduces to your language, so your language isn’t decidable. If the latter, then the complement of \(A_{TM}\) reduces to your language, so your language isn’t even recognizable.
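The reduction for the first case, \(P(M_{\emptyset}) = 0\), can be sketched as follows. This is only an illustration under a loud assumption: Python callables that return `True` or `False` stand in for Turing Machines, and the name `build_reduction_machine` is hypothetical. Real machines may loop forever, which toy callables cannot show.

```python
def build_reduction_machine(M, w, M_yes):
    """Build M' from an A_TM instance (M, w).

    L(M') = L(M_yes) if M accepts w, and the empty language otherwise.
    Callables here are stand-ins for Turing Machines (an assumption).
    """
    def M_prime(x):
        # Step 1: simulate M on w. For a real TM this may never halt,
        # in which case M' accepts nothing, matching the empty language.
        if not M(w):
            return False
        # Step 2: reached only when M accepts w; behave exactly like M_yes.
        return M_yes(x)

    return M_prime
```

Since \(P\) is semantic, \(P(M') = 1\) exactly when \(M\) accepts \(w\), so a decider for \(L\) would decide \(A_{TM}\). The second case swaps \(M_{yes}\) for \(M_{no}\) and flips the answer.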
Richard Nixon
Last edited: August 8, 2025
Richard Nixon was an American president, but is pretty much known as the Watergate guy.
- Served in House and Senate
- Eisenhower’s VP for 8 years
- Lost first to JFK
Richard Nixon was a pragmatist; he pushed the economy out of recession via Keynesian policies.
Richard Nixon also realized that the large southern population can be motivated via racist policies, so he shifted the .