What is Context Precision?
Context Precision is a RAG evaluation metric that measures how much of the retrieved context is relevant to the user's question or expected answer.
How It Works
Context precision asks whether the evidence supplied to the model is mostly useful or mostly noise. High context precision means the retrieved chunks are relevant and do not distract the generator; low precision means the context window contains unrelated passages that can increase cost, confuse the model, or introduce unsupported claims. It is usually evaluated alongside context recall because a system can be precise but miss necessary evidence, or broad but noisy.
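In its simplest form, the idea above reduces to the fraction of retrieved chunks that are relevant. The sketch below assumes each chunk has already been labeled relevant or not by a human or a judge; the function name and the boolean-label input are illustrative choices, not a specific library's API.

```python
from typing import Sequence


def context_precision(relevance: Sequence[bool]) -> float:
    """Fraction of retrieved chunks labeled relevant.

    relevance[i] is True when the i-th retrieved chunk was judged relevant.
    Returning 0.0 for an empty retrieval is an assumption of this sketch,
    chosen so callers never divide by zero.
    """
    if not relevance:
        return 0.0
    return sum(relevance) / len(relevance)


# Four chunks retrieved, three judged relevant.
print(context_precision([True, True, False, True]))  # 0.75
```

Some evaluation frameworks use a rank-weighted variant that rewards placing relevant chunks earlier; the plain ratio here ignores ordering.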
Key Characteristics
- Focuses on the relevance of retrieved context rather than the final answer alone
- Penalizes noisy, redundant, or off-topic chunks in the prompt
- Complements context recall, which measures whether required evidence was retrieved
- Can be scored with human labels, comparison against reference answers, or LLM-as-judge pipelines
- Useful for tuning top-k, chunking, filtering, reranking, and query rewriting
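To illustrate the top-k tuning point, the following sketch measures precision over the first k chunks of a ranked list. The relevance labels are made-up data for a single hypothetical query, and `precision_at_k` is an illustrative helper rather than part of any particular toolkit.

```python
from typing import Sequence


def precision_at_k(relevance: Sequence[bool], k: int) -> float:
    """Precision over the first k retrieved chunks (rank order preserved)."""
    top = relevance[:k]
    return sum(top) / len(top) if top else 0.0


# Hypothetical relevance labels for ten ranked chunks from one query.
labels = [True, True, False, True, False, False, False, True, False, False]

for k in (2, 4, 6, 10):
    print(f"top-{k}: precision = {precision_at_k(labels, k):.2f}")
```

With these labels, precision falls from 1.00 at k=2 to 0.40 at k=10, which is the typical pattern when a larger top-k pulls in lower-ranked, noisier chunks.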
Common Use Cases
- Evaluating whether RAG retrieval returns too much irrelevant context
- Comparing retriever and reranker configurations
- Detecting noisy chunks after changing chunk size or overlap
- Optimizing context-window usage and generation cost
- Building regression tests for retrieval quality
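A regression check for retrieval quality might look like the sketch below, which compares mean context precision across two configurations. The labels, the `MIN_PRECISION` threshold, and the function names are all assumptions for illustration; a real test would load labeled queries from an evaluation set.

```python
from statistics import mean

# Hypothetical per-query relevance labels for two retriever configurations.
# Each inner list holds one query's retrieved chunks in rank order.
BASELINE = [[True, False, True, False], [True, True, False, False]]
CANDIDATE = [[True, True, True, False], [True, True, False, True]]

MIN_PRECISION = 0.6  # assumed quality bar for this sketch


def mean_context_precision(runs: list[list[bool]]) -> float:
    """Average per-query precision; empty retrievals count as 0.0."""
    return mean(sum(r) / len(r) if r else 0.0 for r in runs)


def check_no_regression(baseline: list[list[bool]], candidate: list[list[bool]]) -> bool:
    """Pass only if the candidate clears the bar and is not below the baseline."""
    cand = mean_context_precision(candidate)
    return cand >= MIN_PRECISION and cand >= mean_context_precision(baseline)


print(mean_context_precision(BASELINE))          # 0.5
print(mean_context_precision(CANDIDATE))         # 0.75
print(check_no_regression(BASELINE, CANDIDATE))  # True
```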
Example
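The following is a minimal, self-contained sketch of scoring one query. The question, the chunks, and the keyword-based `judge_relevant` function are all hypothetical; the keyword check is only a stand-in for a human label or an LLM-as-judge call.

```python
def judge_relevant(question_terms: set[str], chunk: str) -> bool:
    """Toy relevance judge: a chunk is relevant if it shares a key term.

    Stand-in for a human label or an LLM-as-judge call.
    """
    words = {w.strip(".,?").lower() for w in chunk.split()}
    return bool(question_terms & words)


# Key terms taken from the question "What is the refund policy?"
question_terms = {"refund", "refunds"}

retrieved = [
    "Refunds are issued within 14 days of purchase.",
    "Our office is closed on public holidays.",
    "A refund requires the original receipt.",
    "Shipping takes three to five business days.",
]

labels = [judge_relevant(question_terms, chunk) for chunk in retrieved]
precision = sum(labels) / len(labels)

print(labels)     # [True, False, True, False]
print(precision)  # 0.5
```

Half of the retrieved chunks are off-topic here, so the context precision is 0.5.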
Frequently Asked Questions
What does low context precision mean?
It means the retriever is sending too much irrelevant or redundant evidence to the model, which can waste context and harm answer quality.
Can context precision be too high?
Not in itself, but a high score can be misleading. A system can look precise simply because it retrieves very little, and it can still fail if it misses required evidence. That is why context precision should be read together with context recall.
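The trade-off can be seen by scoring both metrics on the same query. This sketch makes the simplifying assumption that a chunk is relevant exactly when it belongs to a known set of required evidence; the chunk IDs and the function name are hypothetical.

```python
def precision_and_recall(retrieved: set[str], required: set[str]) -> tuple[float, float]:
    """Score retrieved chunk IDs against the chunk IDs the answer needs."""
    hits = retrieved & required
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(required) if required else 0.0
    return precision, recall


required = {"c1", "c2", "c3"}  # hypothetical IDs of the evidence the answer needs

# Narrow retrieval: everything returned is relevant, but two required chunks are missing.
print(precision_and_recall({"c1"}, required))

# Broad retrieval: all required chunks are present, but half the context is noise.
print(precision_and_recall({"c1", "c2", "c3", "c7", "c8", "c9"}, required))
```

The narrow retrieval scores perfect precision with a recall of one third, while the broad retrieval scores perfect recall with a precision of 0.5.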
How is context precision improved?
Common levers include better chunking, metadata filters, reranking, lower top-k, query rewriting, and hybrid retrieval.
Is context precision the same as answer accuracy?
No. It measures retrieved evidence quality. The generator can still produce a bad answer from good context, or a lucky answer from poor context.