Latitude works best as a continuous loop for production agents: observe real traffic, understand what is going wrong, and refine your agent until it is fixed and stays fixed. The product is organized around that loop: Observe, Understand, Refine.

The core workflow

1. Connect telemetry (Observe)

Send traces from your agent to Latitude. Each interaction becomes a trace of spans (LLM calls, tool calls, retrieval, and more), and multi-turn conversations group into sessions. Send a userId and sessionId so you can also break activity down per user and review reliability, errors, and latency per tool. If you have not connected your app yet, follow Start tracing.
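To make the data model concrete, here is a minimal sketch of how traces, spans, and sessions relate. It is not Latitude's SDK or wire format: the `Span` and `Trace` classes, their field names, and both helper functions are hypothetical and exist only to illustrate grouping by sessionId and computing per-tool reliability and latency.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Span:
    # Hypothetical shape; real span attributes depend on your tracing setup.
    kind: str            # e.g. "llm", "tool", "retrieval"
    name: str
    duration_ms: float
    error: bool = False


@dataclass
class Trace:
    trace_id: str
    user_id: str
    session_id: str
    spans: list[Span] = field(default_factory=list)


def group_by_session(traces):
    """Group traces into sessions keyed by session_id."""
    sessions = defaultdict(list)
    for trace in traces:
        sessions[trace.session_id].append(trace)
    return dict(sessions)


def tool_stats(traces):
    """Per-tool call count, error count, and mean latency in milliseconds."""
    totals = defaultdict(lambda: {"calls": 0, "errors": 0, "total_ms": 0.0})
    for trace in traces:
        for span in trace.spans:
            if span.kind != "tool":
                continue
            entry = totals[span.name]
            entry["calls"] += 1
            entry["errors"] += int(span.error)
            entry["total_ms"] += span.duration_ms
    return {
        name: {
            "calls": t["calls"],
            "errors": t["errors"],
            "mean_ms": t["total_ms"] / t["calls"],
        }
        for name, t in totals.items()
    }
```

Without a sessionId on each trace, the first helper has nothing to group on, which is why the identifiers are worth sending from the start.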
2. Find what matters (Understand)

Use Search to find conversations by meaning, exact text, or metadata filters: frustrated users, tool loops, hallucinations, failed workflows, or anything specific to your product. Behaviours goes further, automatically clustering your sessions into the topics users actually hit, so you discover patterns without writing a query.
3. Annotate what is good or bad (Understand)

Open traces from search results, behaviours, or the trace list and leave annotations. A thumbs-down with clear feedback tells Latitude this behaviour is worth tracking. Flaggers also annotate common failure categories automatically, such as frustration, refusal, jailbreaking, tool errors, and empty responses.
4. Let Latitude group failures into signals (Understand)

Failed annotations, flagger matches, evaluation failures, and custom scores are all recorded as scores. Latitude groups similar failures into named, prioritized signals, each with example traces, affected-user counts, trends, and a lifecycle. Signals created from negative annotations are called issues.
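The grouping step can be pictured with a small sketch. This is an illustration under loud assumptions, not Latitude's algorithm: the `Score` class and its fields are hypothetical, and the precomputed `category` label stands in for whatever similarity measure the product actually uses to decide that two failures belong together.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Score:
    trace_id: str
    user_id: str
    source: str     # "annotation", "flagger", "evaluation", or "custom"
    passed: bool
    category: str   # hypothetical label standing in for failure similarity


def group_into_signals(scores, max_examples=3):
    """Bucket failed scores by category and rank buckets by affected users."""
    failures = defaultdict(list)
    for score in scores:
        if not score.passed:
            failures[score.category].append(score)

    signals = [
        {
            "name": category,
            "failures": len(items),
            "affected_users": len({s.user_id for s in items}),
            "example_traces": [s.trace_id for s in items[:max_examples]],
        }
        for category, items in failures.items()
    ]
    # Most affected users first; ties broken by failure count, then name.
    signals.sort(key=lambda s: (-s["affected_users"], -s["failures"], s["name"]))
    return signals
```

Ranking by affected users rather than raw failure count is a design choice of this sketch: it keeps one noisy user from dominating the list.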
5. Triage and monitor (Refine)

Triage signals on the Signals page: set priority, inspect example traces, and resolve noise. Monitors watch a signal, a saved search, a tool, or your raw traffic and open an incident when something needs attention, notifying you in-app, by email, or in Slack. Generate evaluations to keep scoring live traffic for the same failure.
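As a rough mental model of what a monitor decides, the sketch below opens an incident when the failure rate over a recent window crosses a threshold. The function name, the 20% threshold, and the minimum sample size are all assumptions chosen for illustration; the conditions Latitude's monitors actually evaluate are configured in the product and may differ.

```python
def should_open_incident(outcomes, threshold=0.2, min_samples=20):
    """Decide whether a window of recent traces warrants an incident.

    outcomes: list of booleans, True meaning the trace failed.
    threshold and min_samples are illustrative defaults, not product values.
    """
    if len(outcomes) < min_samples:
        # Too little traffic to judge; avoid alerting on a handful of traces.
        return False
    failure_rate = sum(outcomes) / len(outcomes)
    return failure_rate > threshold
```

The minimum sample size is there so that a single failure in a quiet period does not trigger a notification.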
6. Fix and prevent regressions (Refine)

Fix the underlying behaviour in your code, prompts, tools, retrieval, or product flow. Turn the failing traces into a dataset and regression-test the fix so the failure cannot quietly return; you can drive this from your coding agent through the MCP server. Resolve the signal once it is fixed, and the regressed monitor tells you if it comes back. Repeat the loop as new production traffic arrives.
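The regression-testing idea can be sketched in a few lines. Everything here is hypothetical: the dataset shape, the `forbidden` field, the check, and the two stand-in agents are invented for illustration, and the sketch runs locally rather than through the MCP server or any Latitude API.

```python
def run_regression(dataset, agent, check):
    """Replay each case through the agent; return inputs that still fail."""
    still_failing = []
    for case in dataset:
        output = agent(case["input"])
        if not check(case, output):
            still_failing.append(case["input"])
    return still_failing


def no_forbidden_phrase(case, output):
    """Pass when the output avoids the phrase seen in the failing trace."""
    return case["forbidden"].lower() not in output.lower()


# Hypothetical dataset built from traces that were annotated as failures.
dataset = [
    {"input": "Cancel my order", "forbidden": "I cannot help"},
    {"input": "Where is my refund?", "forbidden": "I cannot help"},
]


def old_agent(prompt):
    # Stand-in for the agent before the fix.
    return "I cannot help with that."


def new_agent(prompt):
    # Stand-in for the agent after the fix.
    return f"Sure, looking into: {prompt}"
```

An empty list from `run_regression(dataset, new_agent, no_forbidden_phrase)` means every previously failing case now passes the check; keeping the dataset around lets you rerun it after later changes.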

What to focus on first

If you are setting up Latitude for the first time:
  1. Connect tracing for one production agent.
  2. Add userId and sessionId so traces group by user and conversation.
  3. Search for one failure mode your team already cares about, or browse Behaviours to see what stands out.
  4. Annotate representative traces with specific feedback.
  5. Watch the Signals page for grouped patterns, and let the built-in monitors notify you.
  6. Generate evaluations for the signals you want to score continuously.

Work with agents and self-host

  • MCP: manage your workspace straight from Claude, Cursor, and other agents through the MCP server.
  • Self-hosting: run Latitude in your own infrastructure, from a single host to a full cluster.

Why this works

Latitude does not require you to define every possible failure upfront. You discover failures from real traffic, validate them with human review, and turn important patterns into automated monitoring. Over time, the system becomes a living map of what goes wrong in your agent and whether your fixes are working.