Information Retrieval is the task of finding unstructured material within large collections that satisfies an information need (for structured information).
The volume of unstructured information has grown explosively since the turn of the millennium.
For a ranked system, we can trace a precision-recall curve by computing precision and recall over the top \(k\) results for increasing \(k\), or summarize quality in a single number with mean average precision.
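A minimal sketch of how these numbers could be computed for one query, assuming binary relevance judgments and the common definition of average precision (the mean of precision at each rank where a relevant document appears, divided by the total number of relevant documents). The document ids, the ranking, and the function names are hypothetical toy values, not from any particular system.

```python
def precision_recall_at_k(ranking, relevant, k):
    """Precision and recall over the top-k results of a ranked list."""
    hits = sum(1 for doc in ranking[:k] if doc in relevant)
    return hits / k, hits / len(relevant)


def average_precision(ranking, relevant):
    """Sum precision@k at each rank k holding a relevant doc, over |relevant|."""
    hits = 0
    total = 0.0
    for k, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            total += hits / k
    return total / len(relevant)


def mean_average_precision(runs):
    """Mean of average_precision over (ranking, relevant) pairs, one per query."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Hypothetical toy data: one ranked result list and its relevant set.
ranking = ["d3", "d1", "d7", "d2", "d9"]
relevant = {"d1", "d2", "d5"}

# One (precision, recall) point per cutoff k = 1..5; plotting these gives the curve.
curve = [precision_recall_at_k(ranking, relevant, k)
         for k in range(1, len(ranking) + 1)]
```

Here relevant documents sit at ranks 2 and 4, so the average precision is (1/2 + 2/4) / 3 = 1/3; the never-retrieved `d5` contributes zero, which is why the divisor is the size of the relevant set rather than the number of hits.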
a set of documents: could be static, or dynamically added to
retrieve documents with information relevant to the user’s information need, helping the user complete a task
Stages of Interpolation
- user task => info need: we may not be looking for the right info
- info need => query: the query we write may not be the best formulation of the info we are looking for
“what’s wrong with grepping?”
- we cannot afford to do a linear search over web-scale data
- a “NOT” query is non-trivial
- no semantics
- we have no ranking, so we don’t know which document is the “best”
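To make these limitations concrete, here is a rough sketch of a grep-style search, assuming a tiny in-memory collection. The `grep_scan` function and the three toy documents are hypothetical and only illustrate the points above.

```python
def grep_scan(docs, term, exclude=None):
    """Return ids of docs containing `term` (and not `exclude`), in collection order."""
    matches = []
    for doc_id, text in docs.items():  # every document is read on every query
        words = text.lower().split()
        # A NOT condition can only be confirmed after scanning the whole document.
        if term in words and (exclude is None or exclude not in words):
            matches.append(doc_id)
    return matches  # plain list of matches: no scores, no notion of "best"


# Hypothetical toy collection.
docs = {
    "d1": "Brutus killed Caesar",
    "d2": "Caesar praised Brutus and Calpurnia",
    "d3": "Calpurnia warned Caesar",
}

print(grep_scan(docs, "caesar"))                       # ['d1', 'd2', 'd3']
print(grep_scan(docs, "brutus", exclude="calpurnia"))  # ['d1']
print(grep_scan(docs, "kill"))                         # [] (exact tokens only)
```

The cost grows linearly with the collection size, the results come back in storage order rather than by relevance, and the query `kill` misses `killed` because matching is purely on surface strings.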