Problem: end-to-end analysis of biological interactions across all timescales is hard. There is no explicit relationship between sequence, crystallography, MD, etc. Also, some of these data sources are time-resolved while others are frozen snapshots.
Solution: use ML to glue multiple scales’ analysis together, using ML to
story 1: proteins can be encoded as hierarchies
- protein functional behavior
- secondary structure/primary structure
- amino acids
- sequences!
Slicing through the GenSLMs embedding space can identify these larger-scale properties from the sequence alone, by looking at the “general area” the sequence occupies in the latent space.
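One way to make the “general area” idea concrete is a nearest-centroid lookup: average the embeddings of sequences with a known label, then assign a new sequence to whichever centroid its embedding is closest to. The sketch below is only an illustration under stated assumptions: the 2-D vectors are made-up stand-ins for real embeddings, the labels `binder` / `non_binder` are hypothetical, and nothing here calls the actual GenSLMs code.

```python
import math


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / n for i in range(dim)]


def nearest_label(embedding, labeled_embeddings):
    """Return the label whose centroid is closest (Euclidean) to `embedding`."""
    centroids = {
        label: centroid(vectors) for label, vectors in labeled_embeddings.items()
    }
    return min(centroids, key=lambda label: math.dist(embedding, centroids[label]))


# Hypothetical toy data: 2-D stand-ins for high-dimensional sequence embeddings.
labeled = {
    "binder": [[0.9, 0.1], [1.1, -0.1], [1.0, 0.2]],
    "non_binder": [[-1.0, 0.0], [-0.8, 0.3], [-1.2, -0.1]],
}

# A new embedding that falls in the same region as the "binder" examples.
print(nearest_label([0.7, 0.0], labeled))
```

Euclidean distance to a centroid is the simplest possible choice; with real embeddings, cosine distance or a clustering step might fit better, which is a design decision these notes do not settle.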
story 2: tetrahedral tessellations and finite-element methods to analyze dynamic behavior
Resolving cryo-EM dynamics well enough to capture binding behavior
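As a rough illustration of what a tetrahedral element gives you, the sketch below computes the volume of each tetrahedron in two frames and reports the relative volume change per element, one of the simplest per-element quantities a finite-element style analysis could track. It is not a full finite-element method. The coordinates, the two "frames", and the function names are all hypothetical, and the tessellation is assumed to be given as tuples of four point indices.

```python
def tetra_volume(a, b, c, d):
    """Unsigned volume of the tetrahedron with vertices a, b, c, d (3-D points)."""
    u = [b[i] - a[i] for i in range(3)]
    v = [c[i] - a[i] for i in range(3)]
    w = [d[i] - a[i] for i in range(3)]
    # Scalar triple product u . (v x w), written out as a 3x3 determinant.
    det = (
        u[0] * (v[1] * w[2] - v[2] * w[1])
        - u[1] * (v[0] * w[2] - v[2] * w[0])
        + u[2] * (v[0] * w[1] - v[1] * w[0])
    )
    return abs(det) / 6.0


def volume_strain(frame0, frame1, tets):
    """Relative volume change (V1 / V0 - 1) of each tetrahedron between two frames."""
    strains = []
    for tet in tets:
        v0 = tetra_volume(*(frame0[i] for i in tet))
        v1 = tetra_volume(*(frame1[i] for i in tet))
        strains.append(v1 / v0 - 1.0)
    return strains


# Hypothetical example: one tetrahedron whose fourth vertex moves along z.
frame0 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
frame1 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 2.0)]
tets = [(0, 1, 2, 3)]

print(volume_strain(frame0, frame1, tets))
```

Elements whose volume changes sharply between frames would be candidates for regions that move during binding. That interpretation is an assumption of this sketch, not something these notes establish.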
applications
- training GenSLMs can help identify COVID variants