- Ensure your AI’s responses match your brand’s voice and tone.
- Check that generated content is clear, concise, and appropriate for your audience.
- Quantitatively measure the quality of generated text compared to human-written references.
 
| Name | Description | When to Use | Example Use Case | 
|---|---|---|---|
| Tone | Evaluates the emotional tone and style of the response. | When the style and tone of AI responses matter for your brand or user experience. | A luxury brand’s customer service chatbot that must maintain a sophisticated, professional tone consistent with the brand image. | 
| BLEU & ROUGE | Standard NLP metrics for evaluating text generation quality. These metrics are available only in experiments because they require ground truth to be set in your dataset. | When you want to quantitatively assess the similarity between generated and reference texts. | Evaluating the quality of machine translation or summarization outputs against human-written references. |
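
To make the BLEU & ROUGE row more concrete, here is a minimal sketch of how such overlap scores can be computed against a ground-truth reference. It is an illustration only and does not show how this product computes its metrics: the function names (`bleu`, `rouge_1_recall`), the whitespace tokenization, and the choice of bigram BLEU and ROUGE-1 recall are all assumptions made for the example.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def bleu(candidate, reference, max_n=2):
    """Simplified sentence-level BLEU (hypothetical helper).

    Geometric mean of clipped n-gram precisions for n = 1..max_n,
    multiplied by a brevity penalty. No smoothing is applied, so any
    n-gram order with zero overlap yields a score of 0.0.
    """
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(cand, n))
        ref_counts = Counter(ngrams(ref, n))
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            return 0.0
        log_precisions.append(math.log(overlap / total))
    # Penalize candidates that are shorter than the reference.
    bp = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * math.exp(sum(log_precisions) / max_n)


def rouge_1_recall(candidate, reference):
    """ROUGE-1 recall (hypothetical helper): share of reference unigrams found in the candidate."""
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())
    total = sum(ref_counts.values())
    if total == 0:
        return 0.0
    overlap = sum(min(c, cand_counts[w]) for w, c in ref_counts.items())
    return overlap / total


if __name__ == "__main__":
    generated = "the cat sat"
    ground_truth = "the cat sat on the mat"
    print("BLEU:", round(bleu(generated, ground_truth), 3))
    print("ROUGE-1 recall:", rouge_1_recall(generated, ground_truth))
```

Both functions need a reference string to compare against, which is why metrics of this kind depend on ground truth being present in the dataset. Production implementations typically add proper tokenization, smoothing, and support for multiple references.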