Boolean evaluations allow you to assess the quality of your LLM outputs using a simple true/false criterion. This type of evaluation is helpful when you want to check whether an output meets a specific condition or requirement.

Creating a boolean evaluation

To create a boolean evaluation:

  1. Go to the Evaluations tab in your project.
  2. Click the Create evaluation button.
  3. Provide a name for your evaluation.
  4. Select Boolean as the evaluation type.

Writing the evaluation prompt

When creating a boolean evaluation, make sure your evaluation prompt instructs the evaluator to return a clear true/false answer. The output should be a JSON object in the following format:

{
  "result": <true_or_false>,
  "reason": <explanation_for_the_score>
}

Make sure to include this format in your evaluation prompt. If you’re not sure how to structure your prompt, you can use one of the provided templates as a reference.
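
If you post-process evaluator responses yourself, it can help to check that each response actually matches this format. The following is a minimal sketch in Python, not part of the product: the function name parse_boolean_evaluation is hypothetical, and it assumes the evaluator returns the JSON object as a plain string.

```python
import json


def parse_boolean_evaluation(raw: str) -> tuple[bool, str]:
    """Parse an evaluator response and check it has the expected shape.

    Hypothetical helper for illustration; not a product API.
    """
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")

    result = data.get("result")
    # Require a real JSON boolean; the string "true" or the number 1 is rejected.
    if not isinstance(result, bool):
        raise ValueError("'result' must be true or false")

    reason = data.get("reason")
    if not isinstance(reason, str):
        raise ValueError("'reason' must be a string")

    return result, reason


passed, reason = parse_boolean_evaluation(
    '{"result": true, "reason": "The output cites a source."}'
)
```

Rejecting values such as "true" (a string) keeps ambiguous responses from being silently counted as passes.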

Remember to clearly state the criteria for a true/false outcome in your prompt. For example, your prompt might include a line like:

“Please evaluate whether the following output meets the specified criteria. Return true if it does, and false if it doesn’t.”

This helps maintain consistency in the evaluation process and ensures that the LLM understands how to assess the output.
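
As an illustration, an evaluation prompt that combines the criteria, the output under review, and the required format could be assembled as in the sketch below. The function build_evaluation_prompt and the constant FORMAT_INSTRUCTIONS are hypothetical names chosen for this example, and the overall layout of the prompt is an assumption rather than a required structure.

```python
# Hypothetical constant: restates the required response format for the evaluator.
FORMAT_INSTRUCTIONS = """Respond with a JSON object in exactly this format:
{
  "result": <true_or_false>,
  "reason": <explanation_for_the_score>
}"""


def build_evaluation_prompt(criteria: str, output: str) -> str:
    """Assemble a boolean evaluation prompt (illustrative layout only)."""
    return (
        "Please evaluate whether the following output meets the specified "
        "criteria. Return true if it does, and false if it doesn't.\n\n"
        f"Criteria: {criteria}\n\n"
        f"Output:\n{output}\n\n"
        f"{FORMAT_INSTRUCTIONS}"
    )


prompt = build_evaluation_prompt(
    criteria="The answer mentions at least one source.",
    output="According to the 2023 report, sales rose 4%.",
)
```

Keeping the format instructions in one place means every boolean evaluation asks for the same response shape.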

Best practices

  1. Be specific: Clearly define the criteria for a true or false result in your evaluation prompt.
  2. Keep it simple: Focus on one aspect or requirement per boolean evaluation for clarity.
  3. Use precise language: Avoid ambiguity in your prompts to ensure consistent evaluations.
  4. Combine with other evaluation types: Use boolean evaluations alongside numeric and text evaluations for a more comprehensive assessment of your LLM outputs.

By using boolean evaluations effectively, you can quickly identify whether your LLM outputs meet specific criteria and make data-driven decisions to improve your prompts and applications.
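
One simple way to summarize boolean results across many outputs is a pass rate. The sketch below assumes you have already collected the true/false results into a Python list; the function name pass_rate is hypothetical.

```python
def pass_rate(results: list[bool]) -> float:
    """Return the fraction of evaluations whose result was true."""
    if not results:
        raise ValueError("no results to aggregate")
    # True counts as 1 and False as 0, so the sum is the number of passes.
    return sum(results) / len(results)


rate = pass_rate([True, True, False, True])  # 3 of 4 passed: 0.75
```

Comparing this figure before and after a prompt change gives a quick signal of whether the change helped.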