Cache
Learn how Latitude uses caching to optimize your prompt executions
Latitude implements a caching system for prompt responses to optimize performance and reduce costs. This guide explains how caching works and when it’s applied.
How Caching Works
When you execute a prompt, Latitude automatically caches the response if certain conditions are met. The cache key is generated based on:
- The workspace ID
- The prompt configuration
- The conversation context
This means that identical prompts with the same parameters in the same workspace will return cached results.
Cache Conditions
Caching is only applied when:
- The temperature is set to 0 or not specified
- The prompt execution is successful
This is because non-zero temperatures introduce randomness in the responses, making caching less useful as each execution is intended to be unique.
Benefits
Caching provides several advantages:
- Reduced Costs: Cached responses don’t consume additional API tokens
- Faster Response Times: Cached results are returned immediately
- Consistency: Identical prompts always return the same response
Cache Duration
Currently, cached responses are stored indefinitely. However, you can force a fresh execution by:
- Modifying any part of the prompt configuration
- Changing the conversation context
- Using a non-zero temperature