# Experiments Overview

Learn about the key concepts of Latitude experiments.
## What are Experiments?

Experiments in Latitude let you systematically test, evaluate, and compare different prompt configurations, model versions, and parameters (such as temperature) across a dataset. This helps you determine which prompts and models work best for your use case, based on real, measurable results.
## How Experiments Work

- Run Location: You can run experiments directly from the Prompt Playground or from a Latitude Evaluation.
- Experiments Tab: Each prompt in Latitude has an Experiments tab, where you can compare results from different experiments side-by-side.
## Experiment Components

- Prompt Variants: Test different prompt wordings, instructions, or templates.
- Model Versions: Compare outputs from different models (e.g., `gpt-4.1`, `gpt-4.1-mini`).
- Parameters: Adjust settings like temperature to influence model behavior.
- Evaluations: Attach evaluation metrics (e.g., accuracy, sentiment analysis) to automatically assess experiment outputs.
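To make these components concrete, the sketch below models a single experiment variant as plain data. The `ExperimentVariant` type and its field names are hypothetical illustrations, not Latitude's actual schema or SDK; they only show how a variant combines a prompt wording, a model, a parameter such as temperature, and the evaluations attached to score its outputs.

```typescript
// Illustrative only: a plain data structure showing what a single experiment
// variant combines. The type and field names are hypothetical and do not
// reflect Latitude's actual schema or SDK.
interface ExperimentVariant {
  name: string;          // label to identify the variant in results
  prompt: string;        // the prompt wording or template under test
  model: string;         // e.g. "gpt-4.1" or "gpt-4.1-mini"
  temperature: number;   // parameter being varied
  evaluations: string[]; // evaluation metrics attached to score outputs
}

const variants: ExperimentVariant[] = [
  {
    name: "baseline",
    prompt: "Summarize the support ticket in one sentence: {{ticket}}",
    model: "gpt-4.1",
    temperature: 0.2,
    evaluations: ["accuracy"],
  },
  {
    name: "cheaper-model",
    prompt: "Summarize the support ticket in one sentence: {{ticket}}",
    model: "gpt-4.1-mini",
    temperature: 0.7,
    evaluations: ["accuracy"],
  },
];
```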
## Running an Experiment
- Define Variants: Choose your prompt(s), model, and settings.
- Pick Evaluations: Select which evaluation metrics to run (optional).
- Select Dataset: Pick or generate a dataset to use for testing.
Click Run Experiment and Latitude will run each prompt, model, and parameter combination against the selected dataset and display the results.
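The snippet below is a conceptual sketch of what clicking Run Experiment amounts to: each variant is run against every row of the dataset and the outputs are collected for evaluation. The `Variant` type and the `generate` helper are hypothetical stand-ins for the work Latitude does on your behalf, not its real API.

```typescript
// Conceptual sketch, not Latitude's implementation: run every variant
// against every dataset row and collect the outputs for later evaluation.
type Variant = { name: string; prompt: string; model: string; temperature: number };
type Output = { text: string; tokens: number; costUsd: number; durationMs: number };

// Hypothetical model-call helper standing in for the request Latitude issues.
declare function generate(args: {
  prompt: string;
  model: string;
  temperature: number;
  parameters: Record<string, string>; // dataset columns fill prompt parameters
}): Promise<Output>;

async function runExperiment(variants: Variant[], dataset: Record<string, string>[]) {
  const results: Array<{ variant: string; row: Record<string, string> } & Output> = [];
  for (const variant of variants) {
    for (const row of dataset) {
      const output = await generate({
        prompt: variant.prompt,
        model: variant.model,
        temperature: variant.temperature,
        parameters: row,
      });
      results.push({ variant: variant.name, row, ...output });
    }
  }
  return results; // scored afterwards by the selected evaluations
}
```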
## Comparing Experiments
- Use the Experiments tab to select and compare multiple experiment runs.
- Review metrics like accuracy, cost, duration, and token usage.
- See detailed results, including logs and evaluation scores, for each experiment.
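As an illustration of the comparison itself, the sketch below aggregates individual run results into the per-experiment metrics listed above (score, cost, duration, token usage). The `RunResult` shape and field names are assumptions made for this example, not Latitude's API.

```typescript
// Illustrative aggregation of individual run results into the side-by-side
// metrics shown per experiment. Field names are hypothetical.
type RunResult = {
  variant: string;
  score: number;      // evaluation score for the run, e.g. accuracy in [0, 1]
  tokens: number;
  costUsd: number;
  durationMs: number;
};

function summarize(results: RunResult[]) {
  // Group runs by variant, then average or total the metrics per group.
  const byVariant = new Map<string, RunResult[]>();
  for (const r of results) {
    const runs = byVariant.get(r.variant) ?? [];
    runs.push(r);
    byVariant.set(r.variant, runs);
  }
  return [...byVariant.entries()].map(([variant, runs]) => ({
    variant,
    avgScore: runs.reduce((sum, r) => sum + r.score, 0) / runs.length,
    totalCostUsd: runs.reduce((sum, r) => sum + r.costUsd, 0),
    totalTokens: runs.reduce((sum, r) => sum + r.tokens, 0),
    avgDurationMs: runs.reduce((sum, r) => sum + r.durationMs, 0) / runs.length,
  }));
}
```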
## Benefits
- Objective Comparison: Quickly see which prompts and models perform best on your tasks.
- Visual Analysis: Side-by-side results make differences easy to spot.
- Cost Tracking: Monitor token and cost usage for each variant.