SARS-COV2 Structural Analysis
Last edited: August 8, 2025
- traditional staining techniques to analyze the epitopes being targeted
- uses cryo-EM structural analysis to figure out the structural points of neutralization
- predict which antibodies bind correctly to force certain structures to neutralize COVID-19
- analyze mRNA-vaccine-elicited antibodies to see how similar they are to the useful ones predicted in 3)
The study identified three epitopes, C1520, C1791, and C1717, which change the structure/activity of all three variants of concern as identified using the methods above, and which are impervious to mutations in the main supersite.
Sarsa (Lambda)
Last edited: August 8, 2025
Sarsa (Lambda) is SARSA with Eligibility Traces (\(\lambda\)).
Previous approaches for dealing with Partially Observable Markov Decision Processes (POMDPs):
- memory-based state estimation (beliefs)
- special planning methods
Key question: Can we use MDP reinforcement learning to deal with POMDPs?
Background
Recall MDP SARSA:
\begin{equation} Q(s,a) \leftarrow Q(s,a) + \alpha \qty [(r + \gamma Q(s', a')) - Q(s,a)] \end{equation}
Recall that SARSA can take a long time to learn under sparse rewards, because it takes many updates for a reward to backpropagate to earlier state-action pairs.
Hence, we use Eligibility Traces, which keep track of what’s “eligible” for updates.
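A minimal sketch of how eligibility traces speed up that backpropagation, assuming a tabular setting with accumulating traces. The function name `sarsa_lambda_step`, the trace table `E`, the hyperparameter values, and the three-step chain are all illustrative assumptions, not part of the note. The TD error is the same bracketed term as in the SARSA update above; the only change is that every pair with a nonzero trace receives a share of it.

```python
from collections import defaultdict

def sarsa_lambda_step(Q, E, s, a, r, s_next, a_next,
                      alpha=0.1, gamma=0.9, lam=0.8):
    """One tabular SARSA(lambda) update with accumulating traces (assumed variant)."""
    # TD error: the bracketed term of the plain SARSA update.
    delta = (r + gamma * Q[(s_next, a_next)]) - Q[(s, a)]
    # Mark the pair we just visited as eligible.
    E[(s, a)] += 1.0
    # Every eligible pair gets a share of the error, then its trace decays.
    for key in list(E):
        Q[key] += alpha * delta * E[key]
        E[key] *= gamma * lam
    return delta

Q = defaultdict(float)  # action values, default 0
E = defaultdict(float)  # eligibility traces, default 0

# Hypothetical 3-step chain with a reward only on the last transition.
# State 3 is treated as terminal, so its Q-value stays 0.
episode = [
    (0, "right", 0.0, 1, "right"),
    (1, "right", 0.0, 2, "right"),
    (2, "right", 1.0, 3, "right"),
]
for s, a, r, s2, a2 in episode:
    sarsa_lambda_step(Q, E, s, a, r, s2, a2)
```

After this single episode all three visited pairs have nonzero values, with earlier pairs receiving smaller credit. With plain SARSA only the last pair, `(2, "right")`, would have been updated.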
SARSOP
Last edited: August 8, 2025
Big problem: the curse of dimensionality and the curse of history.
PBVI and HSVI try to sample the belief simplex generally. Instead, we should try to sample the optimal reachable set.
Background
Recall one-step lookahead in POMDPs. The difficulty is that the sum over all of the alpha-vectors is still very expensive to compute. So, in PBVI, we only do this for a small set of beliefs.
SARSOP
- sample \(R^{*}\)
- backup
- prune
Initialization
Choose an initial belief, action, and observation using “suitable heuristics”. Initialize a set of alpha vectors corresponding to this belief.
SAT is in NP
Last edited: August 8, 2025
Recall SAT is in NP because if \(\phi \in \text{SAT}\), then there is a short (polynomial in \(n\)), efficiently (poly-time) checkable proof: just read out the satisfying assignment.
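The "efficiently checkable proof" can be sketched as a verifier that takes the formula and a claimed satisfying assignment and checks it in a single pass. This sketch assumes \(\phi\) is given in CNF with DIMACS-style integer literals (`3` means \(x_3\), `-3` means \(\lnot x_3\)); the function name `satisfies` and the example formula are illustrative.

```python
def satisfies(clauses, assignment):
    """Return True iff `assignment` satisfies every clause of the CNF formula.

    clauses:    list of clauses, each a list of nonzero ints (assumed DIMACS style)
    assignment: dict mapping variable index -> bool
    Runs in time linear in the total number of literals.
    """
    return all(
        # A clause holds if at least one literal agrees with the assignment.
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )

# (x1 or not x2) and (x2 or x3) and (not x1 or not x3)
phi = [[1, -2], [2, 3], [-1, -3]]
certificate = {1: True, 2: True, 3: False}
accepted = satisfies(phi, certificate)
rejected = satisfies(phi, {1: True, 2: False, 3: True})
```

The certificate has one bit per variable, so it is short, and the check touches each literal once, so it is fast. Finding such a certificate is the hard part; verifying it is not.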
Satellite Assignment Problem
Last edited: August 8, 2025
Goal: for a bunch of satellites with
\begin{equation} \alpha \qty(\beta) = \text{argmax}_{x \in X}\sum_{i=1}^{n} \sum_{j=1}^{m} \beta_{ij}x_{ij} \end{equation}
where \(\beta\) is the benefit matrix of each Agent assigned to each Task. This is greedy and can be solved with the Hungarian Method. But this becomes hard when satellites MOVE and things start running out of time: the problem becomes sequential, with dependencies from past to future.
Solution: Multi-Agent RL. But a vanilla solution will produce conflicts, because the dominant strategy may be the same for each agent.
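For reference, the static (non-sequential) objective \(\alpha(\beta)\) above can be sketched on a tiny instance. The note does not spell out the feasible set \(X\), so this sketch assumes each satellite takes exactly one task and no task is shared. It also swaps in brute-force enumeration for the Hungarian Method, purely to keep the example dependency-free; that is only viable for very small instances. The function name `best_assignment` and the benefit values are made up.

```python
from itertools import permutations

def best_assignment(beta):
    """Maximize sum_i beta[i][tasks[i]] over one-to-one assignments.

    beta: n x m benefit matrix (list of lists) with n <= m.
    Brute force over all assignments; a stand-in for the Hungarian Method.
    """
    n, m = len(beta), len(beta[0])
    best_value, best_tasks = float("-inf"), None
    # tasks[i] is the task index given to satellite i; no task is reused.
    for tasks in permutations(range(m), n):
        value = sum(beta[i][j] for i, j in enumerate(tasks))
        if value > best_value:
            best_value, best_tasks = value, tasks
    return best_value, list(best_tasks)

# Hypothetical benefit matrix: rows are satellites, columns are tasks.
beta = [
    [4, 1, 3],
    [2, 0, 5],
    [3, 2, 2],
]
value, tasks = best_assignment(beta)
```

Here satellite 0 takes task 0, satellite 1 takes task 2, and satellite 2 takes task 1, for a total benefit of 11. Note that satellites 0 and 2 would each prefer task 0 on their own, which is the kind of conflict that shows up when agents choose independently.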
