Learn how to create text evaluations for your prompts.
Text evaluations allow you to assess the quality of your LLM outputs using free-form text responses. This type of evaluation is helpful when you need detailed, qualitative feedback on your outputs or when you want to capture complex assessments that can’t be easily represented by boolean or numeric values.
When creating a text evaluation, you have the flexibility to structure your prompt and specify the desired output format according to your needs. Unlike boolean or numeric evaluations, text evaluations don’t require a specific JSON format for the output. You can request any type of textual response, such as:
Detailed analysis
Pros and cons
Suggestions for improvement
Comparisons to ideal outputs
Custom scoring systems
For example, your prompt might include instructions like:
“Please provide a detailed evaluation of the following output. Include strengths, weaknesses, and suggestions for improvement.”
or
“Analyze the given output and provide a score from 1 to 5 for each of the following criteria: accuracy, relevance, and clarity. Explain your reasoning for each score.”
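If you assemble evaluation prompts programmatically, the second instruction above could be generated from a list of criteria. The following is a minimal sketch, not part of any platform API: the function name build_text_eval_prompt and the "Output to evaluate" label are hypothetical choices made for illustration.

```python
def build_text_eval_prompt(output: str, criteria: list[str]) -> str:
    """Assemble an evaluator prompt asking for a 1-5 score and reasoning per criterion.

    Hypothetical helper: the wording mirrors the example instruction in this guide,
    and the layout of the final prompt is an assumption.
    """
    criteria_list = ", ".join(criteria)
    return (
        "Analyze the given output and provide a score from 1 to 5 for each of "
        f"the following criteria: {criteria_list}. "
        "Explain your reasoning for each score.\n\n"
        f"Output to evaluate:\n{output}"
    )


prompt = build_text_eval_prompt(
    "Paris is the capital of France.",
    ["accuracy", "relevance", "clarity"],
)
print(prompt)
```

Keeping the criteria in a list makes it easy to reuse the same instruction text across evaluations while changing only what is being assessed.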
Be clear and specific: Clearly define what you want the evaluation to cover and how you want the response structured.
Consider using a consistent format: While not required, using a consistent format across your text evaluations can make it easier to analyze results.
Ask for explanations: Encourage detailed explanations to get more valuable insights from your evaluations.
Use rubrics: If applicable, provide a rubric or set of criteria to guide the evaluation process.
Combine with other evaluation types: Use text evaluations alongside boolean and numeric evaluations for a comprehensive assessment of your LLM outputs.
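To illustrate why a consistent format helps with analysis, here is a small sketch that pulls per-criterion scores out of a free-form text evaluation. It assumes you have asked the evaluator to write each score on its own line as "Criterion: N/5"; that line format, the sample text, and the extract_scores helper are all hypothetical and not something text evaluations require.

```python
import re

# Assumed line format: "<criterion>: <score>/5", one criterion per line.
SCORE_LINE = re.compile(
    r"^\s*(?P<criterion>[A-Za-z ]+?)\s*:\s*(?P<score>[1-5])\s*/\s*5\b",
    re.MULTILINE,
)


def extract_scores(evaluation_text: str) -> dict[str, int]:
    """Return a mapping of lowercased criterion name to its 1-5 score."""
    return {
        match.group("criterion").strip().lower(): int(match.group("score"))
        for match in SCORE_LINE.finditer(evaluation_text)
    }


# Made-up evaluator response used only to demonstrate the parsing.
sample = """Accuracy: 4/5 - The facts are correct but one date is missing.
Relevance: 5/5 - Directly answers the question.
Clarity: 3/5 - Sentences are long and hard to follow.
Overall, the answer is solid but could be tightened."""

scores = extract_scores(sample)
print(scores)
```

Lines that do not follow the assumed format, such as the closing summary, are simply ignored, so the qualitative explanation stays available in the full text while the scores can be aggregated separately.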
By using text evaluations effectively, you can gain rich, qualitative insights into your LLM outputs and identify nuanced areas for improvement in your prompts and applications.