Problem: end-to-end analysis of biological interactions across all timescales is hard. There is no explicit relationship between sequence, crystallography, MD, etc. Also, some of these data sources are time-resolved while others are frozen snapshots.
Solution: use ML to glue the analyses at multiple scales together.
Proteins can be encoded as hierarchies:
- protein functional behavior
- secondary structure/primary structure
- amino acids
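
One way to picture the hierarchy above is as a small tree, sketched here under loud assumptions: the `Level` class, the level names, and the example labels are all hypothetical and chosen only for illustration; they are not taken from GenSLMs or any other library.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Level:
    """One node in a hypothetical protein hierarchy."""
    name: str                      # which level this is, e.g. "function"
    label: str                     # what the node holds at that level
    children: List["Level"] = field(default_factory=list)


def depth(node: Level) -> int:
    """Number of levels from this node down to its deepest leaf."""
    return 1 + max((depth(child) for child in node.children), default=0)


def leaves(node: Level) -> List[str]:
    """Labels of the leaf nodes (here, amino acids), left to right."""
    if not node.children:
        return [node.label]
    found: List[str] = []
    for child in node.children:
        found.extend(leaves(child))
    return found


# Toy example with made-up labels: function -> structure -> amino acids.
protein = Level("function", "example functional behavior", [
    Level("structure", "helix", [Level("amino_acid", "A"), Level("amino_acid", "L")]),
    Level("structure", "loop", [Level("amino_acid", "G")]),
])

print(depth(protein))   # three levels in this toy tree
print(leaves(protein))  # leaf labels recover the residue sequence
```

Reading the leaves left to right recovers the sequence, while the upper levels carry the larger-scale annotations.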
Slicing through the embedding space of GenSLMs can be used to identify these larger-scale properties from the sequence alone, by looking at the “general area” the sequence occupies in the latent space.
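
A minimal sketch of the “general area” idea, assuming it can be approximated by a nearest-centroid lookup with cosine similarity. This is a swapped-in illustration, not the GenSLMs method: the embeddings are synthetic random vectors, and `family_a`, `family_b`, and every function name here are hypothetical.

```python
import math
import random


def mean_vector(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def build_centroids(embeddings, labels):
    """Map each label to the mean embedding of its members."""
    groups = {}
    for vector, label in zip(embeddings, labels):
        groups.setdefault(label, []).append(vector)
    return {label: mean_vector(vectors) for label, vectors in groups.items()}


def assign_region(query, centroids):
    """Return the label whose centroid is most similar to the query."""
    return max(centroids, key=lambda label: cosine(query, centroids[label]))


def synthetic_cluster(rng, center, n, scale=0.1):
    """Stand-in for real sequence embeddings: Gaussian noise around a center."""
    return [[c + rng.gauss(0.0, scale) for c in center] for _ in range(n)]


rng = random.Random(0)
family_a = synthetic_cluster(rng, [5.0, 0.0, 0.0], 20)
family_b = synthetic_cluster(rng, [0.0, 5.0, 0.0], 20)
embeddings = family_a + family_b
labels = ["family_a"] * 20 + ["family_b"] * 20

centroids = build_centroids(embeddings, labels)
predicted = assign_region([4.8, 0.2, 0.1], centroids)
print(predicted)
```

With real embeddings, the labeled points would come from sequences whose larger-scale properties are already known, and the query would be the embedding of a new sequence.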