Response quality metrics help you measure how well your AI system answers user questions, follows instructions, and provides useful information. These metrics are key for building reliable, helpful, and user-friendly AI applications. Use these metrics when you want to:
  • Ensure your AI’s responses are factually correct and complete.
  • Check that the model follows instructions and uses retrieved information effectively.
  • Evaluate how well your system grounds answers in context or source material.
Below is a quick reference table of all response quality metrics:
| Name | Description | Supported Nodes | When to Use | Example Use Case |
| --- | --- | --- | --- | --- |
| Chunk Attribution Utilization | Assesses whether the response uses the retrieved chunks and properly attributes information to source documents. | Retriever span | When implementing RAG systems and you want to ensure proper attribution and that retrieved information is used efficiently. | A legal research assistant that must cite specific cases and statutes when providing legal information. |
| Chunk Relevance | Measures whether each retrieved chunk contains information that could help answer the user’s query. | Retriever span | When evaluating the relevance of individual retrieved chunks to the query. | A RAG system that needs to ensure each retrieved document chunk contributes useful information toward answering user questions. |
| Completeness | Measures how thoroughly the response covers the relevant information available in the provided context. | LLM span | When evaluating whether responses fully address the user’s intent. | A healthcare chatbot that, when given a patient’s medical record as context, must include all relevant critical information from that record in its response. |
| Context Adherence | Measures how well the response aligns with the provided context. | LLM span | When you want to ensure the model grounds its responses in the provided context. | A financial advisor bot that must base investment recommendations on the client’s specific financial situation and goals. |
| Context Precision | Measures the percentage of relevant chunks in the retrieved context, weighted by their position in the retrieval order. | Retriever span | When evaluating the overall quality of your retrieval system’s results and ranking effectiveness. | A document search system that needs to ensure retrieved chunks are relevant and properly ranked by importance. |
| Context Relevance (Query Adherence) | Evaluates whether the retrieved context is relevant to the user’s query. | Retriever span | When assessing the quality of your retrieval system’s results. | An internal knowledge base search that retrieves company policies relevant to specific employee questions. |
| Correctness (Factuality) | Evaluates the factual accuracy of information provided in the response. | LLM span | When accuracy of information is critical to your application. | A medical information system providing drug interaction details to healthcare professionals. |
| Ground Truth Adherence | Measures how well the response aligns with established ground truth. This metric is only available for experiments because it requires ground truth set in your dataset. | Trace | When evaluating model responses against known correct answers. | A customer service AI that must provide accurate product specifications from an official catalog. |
| Instruction Adherence | Assesses whether the model followed the instructions in your prompt template. | LLM span | When using complex prompts and you need to verify the model is following all instructions. | A content generation system that must follow specific brand guidelines and formatting requirements. |
| Precision @ K | Measures the percentage of relevant chunks among the top K retrieved chunks at a specific rank position. | Retriever span | When determining the optimal number of chunks to retrieve (Top K) and evaluating ranking quality at specific positions (see the sketch after this table). | A RAG system that needs to optimize retrieval parameters to balance capturing all relevant chunks against retrieving irrelevant ones. |
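For intuition on the two ranking-based retriever metrics above (Context Precision and Precision @ K), here is a minimal sketch of how such scores can be computed from binary chunk-relevance labels in retrieval order. The function names and the position-weighted formula are illustrative assumptions for this sketch, not the exact implementation behind the hosted metrics.

```python
from typing import List

def precision_at_k(relevance: List[int], k: int) -> float:
    """Fraction of relevant chunks among the top-k retrieved chunks.

    `relevance` holds 1/0 labels in retrieval order (rank 1 first).
    """
    top_k = relevance[:k]
    if not top_k:
        return 0.0
    return sum(top_k) / len(top_k)

def context_precision(relevance: List[int]) -> float:
    """Position-weighted precision over the whole retrieved context.

    Averages precision@k at every rank that holds a relevant chunk, so
    relevant chunks ranked earlier contribute more. This is a common
    formulation; the hosted metric may weight positions differently.
    """
    total_relevant = sum(relevance)
    if not total_relevant:
        return 0.0
    scores = [
        precision_at_k(relevance, k)
        for k, rel in enumerate(relevance, start=1)
        if rel
    ]
    return sum(scores) / total_relevant

# Retrieval order: relevant, irrelevant, relevant, irrelevant, relevant
labels = [1, 0, 1, 0, 1]
print(precision_at_k(labels, 3))   # 2/3 ≈ 0.67
print(context_precision(labels))   # (1/1 + 2/3 + 3/5) / 3 ≈ 0.76
```

In this example, moving a relevant chunk earlier in the retrieval order raises the position-weighted score even though the plain top-K precision stays the same, which is what makes the metric useful for evaluating ranking quality rather than just recall.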

Next steps