Lexicon

Lexicons are pre-labeled datasets that organize words into features. They are useful when training data is sparse, because the mapping from word to feature is already given rather than learned.

Instead of counting individual words, we compute each feature from the label that the lexicon assigns to each token.
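
As a rough illustration, the sketch below counts tokens per lexicon label instead of per word. The tiny `LEXICON` dictionary, its `positive`/`negative` labels, and the function name `lexicon_features` are all made up for this example; a real lexicon would be much larger and might use different label types or scores.

```python
from collections import Counter

# Hypothetical toy lexicon (an assumption for illustration):
# each word maps to its pre-assigned label.
LEXICON = {
    "good": "positive",
    "great": "positive",
    "bad": "negative",
    "awful": "negative",
}


def lexicon_features(tokens, lexicon=LEXICON):
    """Return one feature per lexicon label: the number of tokens with that label."""
    labels = sorted(set(lexicon.values()))
    # Tokens that do not appear in the lexicon contribute to no feature.
    counts = Counter(
        lexicon[token.lower()] for token in tokens if token.lower() in lexicon
    )
    return {label: counts.get(label, 0) for label in labels}


tokens = "The food was great but the service was awful awful".split()
print(lexicon_features(tokens))  # {'negative': 2, 'positive': 1}
```

The feature vector here has one entry per label, not one per vocabulary word, which is what keeps it small when little training data is available.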