Text Evaluations

Text evaluations allow you to assess the quality of your LLM outputs using free-form text responses. This type of evaluation is helpful when you need detailed, qualitative feedback on your outputs or when you want to capture complex assessments that can’t be easily represented by boolean or numeric values.

Creating a text evaluation

To create a text evaluation:

Go to the Evaluations tab in your project.
Click the Create evaluation button.
Provide a name for your evaluation.
Select Text as the evaluation type.

Writing the evaluation prompt

When writing the prompt for a text evaluation, you can structure it and specify the output format however you need. Unlike boolean or numeric evaluations, text evaluations don’t require a specific JSON format for the output. You can request any type of textual response, such as:

Detailed analysis
Pros and cons
Suggestions for improvement
Comparisons to ideal outputs
Custom scoring systems

For example, your prompt might include instructions like: “Please provide a detailed evaluation of the following output. Include strengths, weaknesses, and suggestions for improvement.” or “Analyze the given output and provide a score from 1 to 5 for each of the following criteria: accuracy, relevance, and clarity. Explain your reasoning for each score.”
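If you draft prompts like the second example outside the UI, it can help to assemble them from a list of criteria so the wording stays the same across evaluations. The following is a minimal sketch, not part of the product's API: the function name `build_text_eval_prompt` and the layout of the prompt are assumptions made for illustration.

```python
def build_text_eval_prompt(output_text, criteria):
    """Assemble a free-form evaluation prompt (hypothetical helper)."""
    lines = [
        "Analyze the given output and provide a score from 1 to 5 "
        "for each of the following criteria. "
        "Explain your reasoning for each score.",
        "",
        "Criteria:",
    ]
    # One line per criterion keeps the rubric easy to scan.
    lines += [f"- {criterion}" for criterion in criteria]
    # The output under evaluation goes last, clearly separated.
    lines += ["", "Output to evaluate:", output_text]
    return "\n".join(lines)


prompt = build_text_eval_prompt(
    "Paris is the capital of France.",
    ["accuracy", "relevance", "clarity"],
)
print(prompt)
```

The resulting string can be pasted into the evaluation prompt field. Changing the criteria list is then the only edit needed to reuse the same structure.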

Best Practices

Be clear and specific: Clearly define what you want the evaluation to cover and how you want the response structured.
Consider using a consistent format: While not required, using a consistent format across your text evaluations can make it easier to analyze results.
Ask for explanations: Encourage detailed explanations to get more valuable insights from your evaluations.
Use rubrics: If applicable, provide a rubric or set of criteria to guide the evaluation process.
Combine with other evaluation types: Use text evaluations alongside boolean and numeric evaluations for a comprehensive assessment of your LLM outputs.
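To illustrate why a consistent format makes results easier to analyze, here is a small sketch that pulls scores out of a free-form evaluation. It assumes the prompt asked the evaluator to start each line with `Criterion: N/5`. That format, the helper name `extract_scores`, and the sample text are hypothetical and not something the platform requires or produces.

```python
import re

# Matches lines that begin with "<criterion>: <1-5>/5" (assumed format).
SCORE_LINE = re.compile(
    r"^\s*(\w[\w ]*?)\s*:\s*([1-5])\s*/\s*5\b", re.MULTILINE
)


def extract_scores(evaluation_text):
    """Return {criterion: score} for every line in the assumed format."""
    return {
        name.strip().lower(): int(score)
        for name, score in SCORE_LINE.findall(evaluation_text)
    }


# Hypothetical evaluator response, written for this example.
sample = """Accuracy: 4/5. The answer is correct but omits a caveat.
Relevance: 5/5. It addresses the question directly.
Clarity: 3/5. The second sentence is hard to follow."""

scores = extract_scores(sample)
print(scores)
```

Lines that do not follow the assumed format are ignored, so the explanations remain available in the full text while the scores can be aggregated separately.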

By using text evaluations effectively, you can gain rich, qualitative insights into your LLM outputs and identify nuanced areas for improvement in your prompts and applications.
