How Suggestions Are Generated
- Data Collection: The system gathers results from your completed evaluations (LLM-as-Judge, Programmatic Rules, and Manual Evaluations). Both batch and live evaluation results are considered.
- Pattern Analysis: Latitude analyzes these results, looking for correlations between prompt inputs, outputs, and evaluation scores. It identifies patterns where certain inputs lead to lower scores or specific failure modes.
- Suggestion Generation: Based on these patterns, an AI model generates concrete suggestions for modifying your prompt. These suggestions might involve:
  - Rewording instructions for clarity.
  - Adding context or constraints.
  - Providing better examples (few-shot learning).
  - Adjusting prompt structure.
  - Modifying configuration parameters.
- Prioritization: Suggestions are typically ordered by their potential impact on evaluation scores.
Suggestions become more insightful as more evaluation data is collected. Aim
for at least 20-30 evaluated logs for meaningful analysis, though more data is
generally better.
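
To make the pattern-analysis step more concrete, here is a minimal sketch of the kind of aggregation it describes: grouping evaluation results, flagging evaluations whose average score is low, and ranking them by how much room for improvement they leave. The `EvaluationResult` shape, the `findWeakSpots` function, and the score threshold are illustrative assumptions, not Latitude's internal data model or API.

```typescript
// Simplified evaluation result, assumed for illustration only.
interface EvaluationResult {
  evaluationName: string; // e.g. "Conciseness"
  score: number;          // normalized 0..1
  inputTokens: number;    // one simple input feature to correlate against
}

// Group results by evaluation, flag evaluations whose average score falls
// below a threshold, and report the average input length of the failing
// logs so a suggestion can target that failure mode.
function findWeakSpots(results: EvaluationResult[], threshold = 0.7) {
  const byEvaluation = new Map<string, EvaluationResult[]>();
  for (const r of results) {
    const bucket = byEvaluation.get(r.evaluationName) ?? [];
    bucket.push(r);
    byEvaluation.set(r.evaluationName, bucket);
  }

  const weakSpots: { evaluation: string; avgScore: number; avgInputTokens: number }[] = [];
  for (const [evaluation, group] of byEvaluation) {
    const avgScore = group.reduce((sum, r) => sum + r.score, 0) / group.length;
    if (avgScore < threshold) {
      const failing = group.filter((r) => r.score < threshold);
      const avgInputTokens =
        failing.reduce((sum, r) => sum + r.inputTokens, 0) / Math.max(failing.length, 1);
      weakSpots.push({ evaluation, avgScore, avgInputTokens });
    }
  }

  // Lowest-scoring evaluations first: these offer the largest potential impact.
  return weakSpots.sort((a, b) => a.avgScore - b.avgScore);
}
```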
Viewing and Applying Suggestions
- Navigate to the Prompt: Open the prompt you want to improve in the editor.
- Check for Suggestions: If suggestions are available, a “Suggestions” button with an indicator appears at the bottom of the prompt editor.
- Review Suggestions: Clicking the button opens a panel listing the generated suggestions. Each suggestion includes:
  - The reasoning behind the change, based on evaluation data (e.g., “Outputs often failed the ‘Conciseness’ evaluation for long inputs”).
  - A “View” button that shows a diff of the proposed changes against the current prompt.
- Apply or Dismiss: For each suggestion, you can:
  - Apply: Automatically applies the suggested change to your current prompt draft.
  - Dismiss: Ignores the suggestion.
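
Conceptually, each suggestion pairs its reasoning with a proposed replacement for your draft, and applying or dismissing it amounts to keeping or discarding that replacement. The sketch below is a hypothetical model of that behavior; `PromptSuggestion` and `resolveSuggestion` are illustrative names, not part of Latitude's SDK.

```typescript
// Hypothetical shape of a suggestion as shown in the panel: the reasoning
// derived from evaluation data plus the proposed prompt text.
interface PromptSuggestion {
  reasoning: string;      // e.g. "Outputs often failed the 'Conciseness' evaluation for long inputs"
  proposedPrompt: string; // the prompt with the suggested change applied
}

// "Apply" replaces the current draft with the proposed text;
// "Dismiss" leaves the draft untouched.
function resolveSuggestion(
  draft: string,
  suggestion: PromptSuggestion,
  action: "apply" | "dismiss",
): string {
  return action === "apply" ? suggestion.proposedPrompt : draft;
}
```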
Next Steps
- Ensure you have robust Evaluations set up.
- Regularly Run Evaluations to feed the Refiner.
- Learn about preparing data with Datasets.