Finding training data for AI is hard. Datasets tend to fall into two kinds:
Intentional training data
- Curated specifically for use as training data
- Someone spent time thinking about bias, control, etc.
Training set of convenience
- A dataset that just comes about and gets used because it is easy to obtain
- Problematic: can accidentally introduce bias into the data. Googling images of CEOs is convenient, but for a while the results were all white men.
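
One way to catch this kind of accidental bias before training is to look at how groups are distributed in the collected data. The sketch below is only an illustration: the records, the `gender` field, the function names, and the 0.8 threshold are all hypothetical, and it assumes each scraped item already carries some group metadata.

```python
from collections import Counter


def group_shares(records, key):
    """Return each group's share of the dataset for the given metadata key."""
    counts = Counter(record[key] for record in records)
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}


def dominant_groups(shares, threshold=0.8):
    """Return the groups whose share of the dataset exceeds the threshold."""
    return [group for group, share in shares.items() if share > threshold]


# Made-up metadata standing in for a scraped image set; not real data.
scraped = [{"gender": "male"}] * 9 + [{"gender": "female"}] * 1

shares = group_shares(scraped, "gender")
print(shares)
print(dominant_groups(shares))
```

A flagged group does not prove the dataset is unusable, but it is a cue to apply the kind of deliberate thinking about bias that an intentional training set would get.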