Search and review effectively

Search finds traces; Save search turns a query plus filters into a saved search you can revisit, assign, and track. The habits below help your team find the right cohorts and keep the resulting saved searches valuable as your agent changes. For how search itself works, see Search Overview and Saved Searches. Once a cohort is scoped, see Annotate traces effectively for review habits.

Pick the right surface for what you want to find

When both query and filters are active, a trace must match both to appear.

You want to find…	Use…	Example
Conversations that sound like something	Semantic query	`user frustrated about billing`
Traces that contain a specific phrase	Exact text query (`"..."`)	`"401 Unauthorized"`
Traces in a flow or environment	Filters / metadata	`metadata.flow = "checkout"`
Traces your app already tagged	Tags or status	`status = error`, tag `refund`
Known failure categories (jailbreak, refusal, tool errors)	Flaggers	(automatic, no query needed)
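
To make the "must match both" rule concrete, here is a minimal Python sketch over in-memory traces. It is an illustration only, not Latitude's implementation or API: the `Trace` shape and the `matches` helper are hypothetical, and it models only an exact-phrase query (as a case-sensitive substring check, which is an assumption), since semantic matching can't be reproduced in a few lines.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Trace:
    """Hypothetical, simplified trace: free text plus flat metadata."""
    text: str
    metadata: dict = field(default_factory=dict)


def matches(trace: Trace, phrase: Optional[str], filters: dict) -> bool:
    """Return True only if the trace satisfies the query AND every filter."""
    # Assumption: an exact query behaves like a case-sensitive substring test.
    query_ok = phrase is None or phrase in trace.text
    filters_ok = all(trace.metadata.get(k) == v for k, v in filters.items())
    return query_ok and filters_ok


traces = [
    Trace("POST /pay returned 401 Unauthorized", {"flow": "checkout"}),
    Trace("POST /login returned 401 Unauthorized", {"flow": "signup"}),
    Trace("Payment completed", {"flow": "checkout"}),
]

# Two traces contain the phrase and two are in the checkout flow,
# but only one satisfies both conditions at once.
hits = [t for t in traces if matches(t, "401 Unauthorized", {"flow": "checkout"})]
```

This is why adding a filter to a working query can only shrink the result set, never grow it.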

Empty results? Drop quoted text first, widen the time window, then loosen filters. Over-specific quoted strings are the most common cause of an empty result set.
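
The relaxation order above can be written down as a small generator. This is a sketch under stated assumptions, not a Latitude feature: the `Search` dataclass, the 30-day maximum window, and the choice to drop the last filter first are all hypothetical, picked only to show the ordering.

```python
from dataclasses import dataclass, replace
from typing import Iterator, Optional


@dataclass(frozen=True)
class Search:
    """Hypothetical search definition; not a Latitude API object."""
    semantic: Optional[str] = None   # free-text semantic query
    exact: Optional[str] = None      # quoted exact phrase
    days: int = 7                    # time window in days
    filters: tuple = ()              # (key, value) pairs


def relaxations(search: Search, max_days: int = 30) -> Iterator[Search]:
    """Yield progressively looser searches: quotes, then time, then filters."""
    current = search
    if current.exact is not None:        # 1. drop quoted text first
        current = replace(current, exact=None)
        yield current
    if current.days < max_days:          # 2. widen the time window
        current = replace(current, days=max_days)
        yield current
    while current.filters:               # 3. loosen filters one at a time
        current = replace(current, filters=current.filters[:-1])
        yield current


start = Search(
    semantic="refund request",
    exact="order was damaged",
    days=7,
    filters=(("flow", "checkout"), ("model", "example-model")),
)
steps = list(relaxations(start))
```

In practice you would stop at the first step that returns matches, then tighten again from there rather than keeping the loosest version.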

Keep saved searches small enough to finish

Before you click Save search, check that you could actually work through the matches and that they’re varied enough to learn from.

Small enough to finish. If the result set is in the thousands and you’re reviewing by hand, tighten filters or shorten the time range until a week of review feels realistic. You can always broaden later.
Varied enough to learn. Twenty different traces teach you more about your agent than two hundred near-identical ones. If every match looks the same, add a filter (model, metadata, span count) or tweak the query for edge cases.
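
If you have the matches available as data (for example from an export), both checks can be approximated in a few lines. The sketch below is hypothetical: the trace dictionaries, the `cohort_report` helper, and the 500-trace ceiling are assumptions for illustration, not Latitude's export format or a recommended limit.

```python
from collections import Counter


def cohort_report(traces: list, key: str, max_size: int = 500) -> dict:
    """Summarize cohort size and how varied one metadata field is."""
    values = Counter(
        t.get("metadata", {}).get(key, "<missing>") for t in traces
    )
    # Share of the cohort taken by the single most common value.
    top_share = max(values.values()) / len(traces) if traces else 0.0
    return {
        "size": len(traces),
        "small_enough": len(traces) <= max_size,  # assumed ceiling
        "distinct_values": len(values),
        "top_value_share": round(top_share, 2),
    }


sample = [
    {"metadata": {"model": "model-a"}},
    {"metadata": {"model": "model-a"}},
    {"metadata": {"model": "model-a"}},
    {"metadata": {"model": "model-b"}},
]
report = cohort_report(sample, key="model")
```

A high `top_value_share` is a hint that most matches look alike on that field, which is the cue to add a filter or adjust the query toward edge cases.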

Name saved searches for what’s in them

A good name describes the traces in the cohort, not the query syntax.

Less useful	More useful
`q: payment errors`	`Failed payments last 7 days`
`search v2`	`Checkout flows over 5 steps`
`jailbreak test`	`Jailbreak attempts without refusal`

When you save:

Run the query and filters until the result set looks right.
Save search with a name a teammate can understand without opening it.
Open matches from the trace detail view and annotate as you work through them.

Investigation vs review vs regression watch

The same feature serves three intents:

Investigation (may stay unsaved)

Narrow time window, specific query.
Delete the saved search when done, or Save as new if you want a permanent cohort derived from what you learned.

Review (bounded work)

Stable query and filters so the cohort stays consistent between sessions.
Work matches until the team agrees you’re through it.
Leave the saved search in place if you might need to re-sample later.

Regression watch (ongoing)

Filters on metadata or tags that won’t break when wording changes (e.g. metadata.flow = "checkout", not a one-off phrase from a single bad trace).
For anything you want to keep an eye on, point a monitor at the saved search so Latitude alerts you when matching traces arrive again, instead of reopening it by hand.
Update or delete watches when the product changes. Stale saved searches just get in the way.

Saved searches don’t send notifications on their own. To get alerted when new matches show up, point a monitor at the saved search.

Update vs save as new

When a loaded saved search drifts from what you want:

Update saved search: same intent, refined scope (e.g. extend from 7 to 30 days, add a model filter everyone agrees on).
Save as new search: a related cohort (e.g. same checkout flow but errors only vs all outcomes).

Use Save as new when two teams need similar but different views. Use Update when everyone shares one definition.

Flaggers vs saved searches

Flaggers and saved searches both find traces, but for different jobs.

	Flaggers	Saved searches
Who finds matches	Latitude, on every completed trace	You, when you run or reopen the search
Best for	Known failure categories (jailbreak, frustration, tool errors)	Product-specific cohorts that flaggers don’t cover
Output	Automatic annotations	A bookmarked working set you annotate, export, or inspect manually
Configuration	Project settings (enable, sampling)	Query + filters + name

Use flaggers for the built-in failure types. Use saved searches for flows, metadata combinations, and regressions only your product can name. Most teams use both.

Send metadata and tags your future self will use

The easiest searches start in your app: fields and tags you’ll still recognize in a few months.

Stable keys: metadata.flow, metadata.environment, metadata.feature, not one-off debug strings.
Values you will filter on: If you care about “refunds over $100”, emit metadata.refund_tier = "high" (or a numeric field) rather than hoping the dollar amount appears in user messages.
Tags for cross-cutting flags: production, canary, beta-user, all easy filter targets alongside semantic search.

You don’t need a perfect schema on day one. Add fields when you find yourself re-running the same awkward query twice.
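
As a rough illustration of the points above, an app might assemble its metadata and tags in one place before attaching them to a trace. The helper names, the `"standard"` tier value, and the extra `refund_amount_usd` field are hypothetical; how the resulting fields are attached depends on your instrumentation and is not shown here.

```python
from typing import Optional


def refund_tier(amount_usd: float) -> str:
    """Bucket a refund amount; the $100 threshold mirrors the example above."""
    return "high" if amount_usd > 100 else "standard"


def build_trace_fields(
    flow: str,
    environment: str,
    feature: str,
    refund_amount_usd: Optional[float] = None,
    canary: bool = False,
) -> dict:
    """Return stable metadata keys and cross-cutting tags for one trace."""
    metadata = {"flow": flow, "environment": environment, "feature": feature}
    if refund_amount_usd is not None:
        # Emit a value you can filter on, instead of relying on message text.
        metadata["refund_tier"] = refund_tier(refund_amount_usd)
        metadata["refund_amount_usd"] = refund_amount_usd
    tags = []
    if environment == "production":
        tags.append("production")
    if canary:
        tags.append("canary")
    return {"metadata": metadata, "tags": tags}


fields = build_trace_fields(
    "checkout", "production", "refunds", refund_amount_usd=150.0, canary=True
)
```

Centralizing this in one function keeps key names consistent across call sites, which is what makes filters such as `metadata.flow = "checkout"` dependable later.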

Keep saved searches up to date

Saved searches go stale when the product, model, or prompts change.

Reopen watches you still care about monthly and skim recent matches. If a watch has had no new matches for months, it usually needs to be updated or deleted.
Avoid duplicates: two names for the same cohort confuse the team and waste review effort.
Watch filter-only saves: filters with no query and no time limit can grow forever, so long-lived watches should lean on metadata that stays meaningful.
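
If you track when each watch last had a match (by hand or in a spreadsheet), the monthly check can be reduced to a simple date comparison. This is a sketch only: the `stale_watches` helper, the 90-day idle threshold, and the dates are hypothetical, and nothing here reads from Latitude.

```python
from datetime import date, timedelta


def stale_watches(last_match: dict, today: date, max_idle_days: int = 90) -> list:
    """Return names of saved searches with no match since the cutoff date."""
    cutoff = today - timedelta(days=max_idle_days)
    return sorted(name for name, seen in last_match.items() if seen < cutoff)


# Hypothetical record of the most recent match per saved search.
last_match = {
    "Failed payments last 7 days": date(2025, 5, 28),
    "Checkout flows over 5 steps": date(2025, 1, 10),
}
to_review = stale_watches(last_match, today=date(2025, 6, 1))
```

Anything the function returns is a candidate for updating or deleting, not an automatic deletion; a quiet watch may simply mean the regression has not come back.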

Common pitfalls

Searching for text that only appears in tool results. Use metadata filters instead.
Over-literal quoting. "the user wants a refund because the order was damaged" must appear exactly; use semantic search plus a metadata filter if you have one.
Saving before the result set looks right. A saved search is only useful if its query and filters are correct, so check the matches before you save.
Giant unbounded saved searches. A cohort of 5,000 traces won’t get reviewed by hand; tighten it first.
Confusing flaggers with search. Flaggers annotate automatically; saved searches are for cohorts you scope and review yourself.
Expecting alerts. A saved search won’t email you when a new trace matches; point a monitor at it for that.

What teams often do

A clear owner for each saved search under active review: one person works the cohort, even though everyone can see and open it.
Saved searches named for the area that knows them: checkout with payments, support flows with the team that ships them.
A quick pass on new matches: skim recent hits on the searches you watch before weekly planning.
Save as new instead of arguing: when two squads need slightly different views of the same flow, don’t overwrite a shared search.

Recommended pattern

Start with one or two saved searches per failure mode your team already cares about. Work matches through the trace detail view, and reopen them as part of a weekly habit. For the ones worth watching continuously, point a monitor at them so you’re alerted automatically; delete the searches that stop being useful.
