ICLR2025 Tokenizer-Free Approaches
Last edited: August 8, 2025
Talks
Downsides of Subword Tokenization
- not learned end to end: vocab is fixed, can’t adapt to difficulty
- non-smoothness: similar inputs get mapped to very different token sequences
- [token][ization]
- typo: [token][zi][ation] <- suddenly bad despite small typo
- huge vocabs: yes
- non-adaptive compression ratio: you can’t decide how much to compress (affects FLOPs/document)
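The non-smoothness point can be reproduced with a toy tokenizer. The sketch below is a minimal greedy longest-match tokenizer over a hypothetical four-entry vocabulary (`TOY_VOCAB` is an assumption for illustration, not a real tokenizer's vocabulary); it shows a one-transposition typo changing the token sequence.

```python
def greedy_tokenize(text, vocab):
    """Greedy longest-match tokenization; unknown characters become single-char tokens."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate substring first, then shrink.
        for j in range(len(text), i, -1):
            if text[i:j] in vocab:
                tokens.append(text[i:j])
                i = j
                break
        else:
            # No vocabulary entry matched: fall back to one character.
            tokens.append(text[i])
            i += 1
    return tokens


# Hypothetical toy vocabulary, chosen only to mirror the example in the notes.
TOY_VOCAB = {"token", "ization", "zi", "ation"}

clean = greedy_tokenize("tokenization", TOY_VOCAB)
typo = greedy_tokenize("tokenziation", TOY_VOCAB)  # "iz" transposed to "zi"

print(clean)  # ['token', 'ization']
print(typo)   # ['token', 'zi', 'ation']
```

Real subword tokenizers (BPE, unigram) use learned merges rather than this greedy rule, but the effect is the same in kind: a small edit to the input can change the number and identity of tokens.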
ICLR2025 Wu: Retrieval Head Explains Long Context
Last edited: August 8, 2025
Motivation
Previous work has identified attention “heads” that implement specific mechanisms for retrieving information from context.
Retrieval Head
The authors show that retrieval heads exist in transformers, using the Needle in a Haystack framework.
Key Insight
There exist certain heads that perform retrieval, as measured by the retrieval score.
Methods
Measuring Retrieval Behavior
“retrieval score”: how often a head engages in copy-paste behavior. A decoding step counts as copy-paste when both conditions hold:
- token inclusion: the currently generated token \(w\) is in the needle
- maximal attention: that same token in the needle receives the head's maximum attention score
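The two criteria above can be sketched as a per-head counter. This is a minimal illustration, not the authors' code: the function name, the `(generated_token, attention_weights)` step format, and normalising by the number of decoding steps are all assumptions.

```python
def retrieval_score(steps, context_tokens, needle_span):
    """Fraction of decoding steps where one head shows copy-paste behavior.

    steps: list of (generated_token, attention_weights) for a single head,
           with one attention weight per context position (assumed format).
    needle_span: (start, end) half-open index range of the needle in the context.
    """
    start, end = needle_span
    hits = 0
    for generated_token, attention in steps:
        # Position that receives this head's maximum attention.
        top = max(range(len(attention)), key=attention.__getitem__)
        in_needle = start <= top < end                       # attended token lies in the needle
        same_token = context_tokens[top] == generated_token  # and equals the generated token
        if in_needle and same_token:
            hits += 1
    return hits / len(steps) if steps else 0.0


def peaked(n, index):
    """Toy attention vector with all mass on one position."""
    weights = [0.0] * n
    weights[index] = 1.0
    return weights


context = ["the", "sky", "is", "blue", "and", "the", "key", "is", "42"]
needle = (5, 9)  # "the key is 42"
steps = [
    ("key", peaked(9, 6)),  # attends to "key" inside the needle: counts
    ("is", peaked(9, 2)),   # attends to "is" outside the needle: does not count
    ("42", peaked(9, 8)),   # attends to "42" inside the needle: counts
]
score = retrieval_score(steps, context, needle)
```

Checking that the top-attended position lies in the needle and holds the generated token covers both criteria at once, since it implies the generated token appears in the needle.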
ICLR2025 Yue: Inference Scaling for Long-Context RAG
Last edited: August 8, 2025
“RAG performance can scale almost linearly w.r.t. log inference FLOPs”
Demonstration Based RAG (DRAG)
Method
Adding demonstrations as k in-context examples.
Prompt: documents, input query, final answer.
Parameters: number of documents, number of in context samples, number of iterations upper bound.
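A minimal sketch of how such a prompt could be assembled, assuming each demonstration carries its own documents, query, and answer. The `Document`/`Question`/`Answer` labels and both function names are hypothetical; the notes do not specify a template.

```python
def format_example(documents, query, answer=None):
    """Render one (documents, query, answer) block; the answer is left open for the test query."""
    lines = [f"Document {i}: {doc}" for i, doc in enumerate(documents, start=1)]
    lines.append(f"Question: {query}")
    lines.append("Answer:" if answer is None else f"Answer: {answer}")
    return "\n".join(lines)


def build_drag_prompt(demonstrations, documents, query):
    """Concatenate k in-context demonstrations, then the test documents and query."""
    blocks = [
        format_example(demo["documents"], demo["query"], demo["answer"])
        for demo in demonstrations
    ]
    blocks.append(format_example(documents, query))
    return "\n\n".join(blocks)


demos = [
    {
        "documents": ["Paris is the capital of France."],
        "query": "What is the capital of France?",
        "answer": "Paris",
    }
]
prompt = build_drag_prompt(
    demos,
    ["Canberra is the capital of Australia."],
    "What is the capital of Australia?",
)
print(prompt)
```

The number of documents per block and the number of demonstrations are the two knobs that grow the prompt length, and with it the inference cost.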
Iterative Demonstration Based RAG (IterDRAG)
Method
DRAG above, and then the model can generate a new sub-query. The model decides
Parameters: number of documents, number of in context samples, number of iterations upper bound.
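The iterative loop can be sketched as follows. Everything here is an assumption for illustration: `retrieve` and `generate` are hypothetical callables, the `("sub_query" | "answer", text)` return format is invented, and forcing an answer once the iteration bound is reached is one plausible stopping rule rather than the paper's stated one.

```python
def iter_drag(query, retrieve, generate, max_iterations):
    """Alternate between sub-query generation and retrieval, up to max_iterations."""
    documents = list(retrieve(query))
    sub_queries = []
    for _ in range(max_iterations):
        kind, text = generate(query, documents, sub_queries)
        if kind == "answer":
            return text, sub_queries
        # The model asked a sub-query: retrieve for it and extend the context.
        for doc in retrieve(text):
            if doc not in documents:
                documents.append(doc)
        sub_queries.append(text)
    # Iteration budget exhausted: ask for a final answer (assumed behavior).
    _, text = generate(query, documents, sub_queries, force_answer=True)
    return text, sub_queries


# Toy stand-ins so the sketch runs end to end.
def toy_retrieve(q):
    return [f"doc about {q}"]


def toy_generate(query, documents, sub_queries, force_answer=False):
    if force_answer or len(sub_queries) >= 1:
        return ("answer", f"answer using {len(documents)} docs")
    return ("sub_query", "who directed the film")


answer, asked = iter_drag("which film won", toy_retrieve, toy_generate, max_iterations=3)
```

With the toy stand-ins, the model issues one sub-query, retrieval adds one document, and the second call returns the final answer.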
identity
Last edited: August 8, 2025
An identity lets another number retain its value after an operation.
Which identities are applicable is group dependent. Identities are almost always object dependent.
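A small check makes the dependence on the operation concrete. This is a sketch: `is_identity` is a hypothetical helper, and testing against a finite list of samples gives evidence rather than a proof.

```python
import operator


def is_identity(candidate, operation, samples):
    """True if candidate leaves every sample unchanged on both sides of the operation."""
    return all(
        operation(candidate, x) == x and operation(x, candidate) == x
        for x in samples
    )


numbers = [-3, 0, 2, 7]

add_zero = is_identity(0, operator.add, numbers)   # 0 is the additive identity
mul_one = is_identity(1, operator.mul, numbers)    # 1 is the multiplicative identity
mul_zero = is_identity(0, operator.mul, numbers)   # 0 is not an identity for multiplication
concat_empty = is_identity("", operator.add, ["a", "bc"])  # "" is the identity for string concatenation

print(add_zero, mul_one, mul_zero, concat_empty)  # True True False True
```

The same element (here `0`) is an identity under one operation and not under another, which is the sense in which identities depend on the group and on the objects involved.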
identity politics
Last edited: August 8, 2025
<> NUS-HIST301 American History
Identity politics is the idea that politics became associated with identity-based sub-populations:
- Black Pride Movement
- Chicano Activism
- The American Indian movement
- Termination of reservation system
- Pan-Indian Rights
- Alcatraz and Wounded Knee Occupations
- LGBT movement
- Stonewall
- GLF starts marching
- Asian American
- Yellow Peril
- Model minority movement
- NOW Feminism Acts
- The Equal Rights Amendment was almost ratified, and then Phyllis Schlafly happened
- Environmental Movement
- Silent Spring
- Cuyahoga River on fire
- Richard Nixon creates the EPA
- Earth Day