Definition: Context Adherence measures closed-domain hallucinations: cases where your model said things that were not provided in the context. If a response is adherent to the context (i.e. it has a value of 1 or close to 1), it contains only information given in the context. If a response is not adherent (i.e. it has a value of 0 or close to 0), it likely contains facts that were not included in the context provided to the model.
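
As a rough illustration of how the score might be used downstream, the sketch below flags responses whose Context Adherence value falls under a cutoff. The `ScoredResponse` type, the `flag_possible_hallucinations` helper, and the 0.5 threshold are hypothetical names and values chosen for this example; they are not part of the Galileo API, and a suitable cutoff depends on your use case.

```python
from dataclasses import dataclass


@dataclass
class ScoredResponse:
    """Hypothetical container for a response and its Context Adherence score."""
    response: str
    context_adherence: float  # assumed to lie in [0, 1]


def flag_possible_hallucinations(rows, threshold=0.5):
    """Return the rows whose score is below the (assumed) threshold.

    Scores near 1 suggest the response only uses information from the
    context; scores near 0 suggest it contains facts not in the context.
    """
    flagged = []
    for row in rows:
        if not 0.0 <= row.context_adherence <= 1.0:
            raise ValueError(f"score out of range: {row.context_adherence}")
        if row.context_adherence < threshold:
            flagged.append(row)
    return flagged


if __name__ == "__main__":
    sample = [
        ScoredResponse("Answer grounded in the retrieved passage.", 0.97),
        ScoredResponse("Answer citing a fact absent from the context.", 0.08),
    ]
    for row in flag_possible_hallucinations(sample):
        print(f"review: {row.response} (score={row.context_adherence})")
```

Responses flagged this way are candidates for manual review rather than confirmed hallucinations.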

Context adherence with Luna-2

You can also compute Context Adherence with Luna-2, Galileo’s proprietary in-house evaluation small language models (SLMs). Context Adherence Luna is a cost-effective way to scale up your RAG evaluation workflows. To use Luna-2 for Context Adherence or other metrics, reach out to our team.

Performance Benchmarks

We evaluated Context Adherence against human expert labels on an internal dataset of RAG samples using top frontier models.

Model                      F1 (True)
GPT-4.1                    0.90
GPT-4.1-mini (judges=3)    0.90
Claude Sonnet 4.5          0.89
Gemini 3 Flash             0.89

GPT-4.1 Classification Report

Class    Precision    Recall    F1-Score
False    0.90         0.89      0.89
True     0.89         0.90      0.90

Confusion Matrix (Normalized)

                Predicted True    Predicted False
Actual True     0.898             0.102
Actual False    0.108             0.892

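
The GPT-4.1 classification report can be cross-checked against the normalized confusion matrix. The sketch below recomputes precision, recall, and F1 from the four matrix cells. Because the matrix is row-normalized and the raw counts are not given, the precision figures assume the True and False classes are roughly balanced; that assumption, and the `report_from_matrix` helper name, are ours rather than part of the benchmark.

```python
def report_from_matrix(matrix):
    """Compute (precision, recall, f1) per class.

    `matrix[actual][predicted]` holds row-normalized rates. Precision is
    only meaningful here under the assumption of balanced classes.
    """
    labels = list(matrix)
    report = {}
    for label in labels:
        true_positive = matrix[label][label]
        # Column sum: everything predicted as `label`.
        predicted_total = sum(matrix[actual][label] for actual in labels)
        # Row sum: everything that is actually `label`.
        actual_total = sum(matrix[label].values())
        precision = true_positive / predicted_total
        recall = true_positive / actual_total
        f1 = 2 * precision * recall / (precision + recall)
        report[label] = (precision, recall, f1)
    return report


# Values taken from the normalized confusion matrix for GPT-4.1.
GPT_4_1_MATRIX = {
    "True": {"True": 0.898, "False": 0.102},
    "False": {"True": 0.108, "False": 0.892},
}

if __name__ == "__main__":
    for label, (p, r, f1) in report_from_matrix(GPT_4_1_MATRIX).items():
        print(f"{label:<6} precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

Rounded to two decimals, these values agree with the classification report for both classes.
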
Benchmarks based on internal evaluation dataset. Performance may vary by use case.
If you would like to dive deeper or start implementing Context Adherence, check out the following resources:

Examples

  • Context Adherence Examples - Log in and explore the “Context Adherence” Log Stream in the “Preset Metric Examples” Project to see this metric in action.