Where this fits: Part of Refine. Expected output turns a dataset of real traces into a regression test set in which each row has a known-good answer.
Expected output is the correct or desired answer for a dataset row. It lets a test compare your agent’s actual output against a known-good result, rather than only checking the output in isolation. Expected output is optional, since many evaluations check a response on its own merits, but it is what makes a row a precise regression case.
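As a rough illustration of that difference, the Python sketch below checks one row both ways. The `Row` class, its field names, and both check functions are hypothetical and are not part of the product's API. Exact matching after whitespace and case normalisation is only one possible way to compare against an expected output.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Row:
    """Hypothetical stand-in for a dataset row; not the product's real schema."""
    input: str
    output: str                             # what the agent actually produced
    expected_output: Optional[str] = None   # known-good answer, if one was added


def _normalize(text: str) -> str:
    # Collapse whitespace and ignore case so trivial formatting differences pass.
    return " ".join(text.split()).lower()


def check_in_isolation(row: Row) -> bool:
    # Assumed standalone check: the output only has to be non-empty.
    return bool(row.output.strip())


def check_against_expected(row: Row) -> Optional[bool]:
    # Returns None when the row has no expected output to compare against.
    if row.expected_output is None:
        return None
    return _normalize(row.output) == _normalize(row.expected_output)


row = Row(
    input="What is the refund window?",
    output="Refunds are accepted within 14 days.",
    expected_output="Refunds are accepted within 30 days.",
)

print(check_in_isolation(row))      # True: the answer looks fine on its own
print(check_against_expected(row))  # False: it differs from the known-good answer
```

The standalone check passes here even though the answer is wrong, which is the gap an expected output closes.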

Add expected output to a row

1. Open the row

In a dataset, open a row to view its input, output, and fields.

2. Fill in the expected output

Add the correct answer in the Expected output field. A row with no expected output shows an Add expected output prompt, so it is easy to see which rows still need one.

Where the expected answer comes from

When you build a dataset from a failing signal, the agent’s actual output was wrong, which is why the signal exists. The expected output is the response the agent should have produced. Common sources:
  • the correct answer a human reviewer would give
  • the behaviour described in the signal or in an annotation
  • a corrected version of the original output
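The sketch below shows one way to think about this in Python: the agent's wrong output is kept for reference and the corrected answer is stored alongside it as the expected output. The `DatasetRow` shape and both helper functions are hypothetical, invented for illustration, and do not reflect the product's actual data model or API.

```python
from typing import Optional, TypedDict


class DatasetRow(TypedDict):
    """Hypothetical row shape used only for this illustration."""
    input: str
    output: str
    expected_output: Optional[str]


def to_regression_row(input_text: str, actual_output: str,
                      corrected_output: Optional[str] = None) -> DatasetRow:
    # Keep the agent's wrong output for reference and store the fix separately.
    return {
        "input": input_text,
        "output": actual_output,
        "expected_output": corrected_output,
    }


def rows_missing_expected(rows: list[DatasetRow]) -> list[DatasetRow]:
    # Rows a reviewer still needs to fill in before they work as regression cases.
    return [r for r in rows if not r["expected_output"]]


rows = [
    to_regression_row("Convert 2 km to metres", "200 m", corrected_output="2000 m"),
    to_regression_row("Convert 3 km to metres", "300 m"),
]

for r in rows_missing_expected(rows):
    print(r["input"])  # Convert 3 km to metres
```

Filtering for rows without an expected output mirrors the Add expected output prompt: both point to the rows that are not yet precise regression cases.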

Next step