Cache

Latitude implements a caching system for prompt responses to optimize performance and reduce costs. This guide explains how caching works and when it’s applied.

How Caching Works

When you execute a prompt, Latitude automatically caches the response if certain conditions are met. The cache key is generated based on:

The workspace ID
The prompt configuration
The conversation context

This means that identical prompts with the same parameters in the same workspace will return cached results.
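As a rough illustration of how a key could be derived from those three inputs, here is a minimal sketch. It is not Latitude's actual implementation: the function name `cache_key`, the argument names, and the choice of JSON serialization plus SHA-256 are all assumptions made for the example.

```python
import hashlib
import json


def cache_key(workspace_id, config, messages):
    """Hypothetical helper: build a deterministic key from the three inputs.

    workspace_id, config and messages stand in for the workspace ID, the
    prompt configuration and the conversation context described above.
    """
    # Sorting keys makes the serialization independent of dict ordering,
    # so logically identical configurations map to the same key.
    payload = json.dumps(
        {"workspace_id": workspace_id, "config": config, "messages": messages},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

Under this sketch, two calls with the same workspace, configuration, and conversation produce the same key, while changing any one of the three produces a different key.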

Cache Conditions

Caching is only applied when:

The temperature is set to 0 or not specified
The prompt execution is successful

The temperature condition exists because non-zero temperatures introduce randomness into responses: each execution is expected to differ, so a cached result is less useful.
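The two conditions can be expressed as a single check. This is only a sketch: `is_cacheable`, the `config` dictionary, and the `succeeded` flag are hypothetical names, not part of Latitude's API.

```python
def is_cacheable(config, succeeded):
    """Hypothetical check mirroring the two cache conditions.

    config is assumed to be a dict that may or may not contain a
    "temperature" entry; succeeded says whether the execution completed
    without error.
    """
    temperature = config.get("temperature")
    # An unspecified temperature is treated the same as temperature 0.
    deterministic = temperature is None or temperature == 0
    return succeeded and deterministic
```

For example, `is_cacheable({"temperature": 0.7}, True)` is `False` because of the temperature, and `is_cacheable({}, False)` is `False` because the execution failed.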

Benefits

Caching provides several advantages:

Reduced Costs: Cached responses don’t consume additional API tokens
Faster Response Times: Cached results are returned without re-running the prompt
Consistency: Identical prompts that meet the cache conditions return the same response

Cache Duration

Currently, cached responses are stored indefinitely. However, you can force a fresh execution by:

Modifying any part of the prompt configuration
Changing the conversation context
Using a non-zero temperature
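To make the three options concrete, the following toy in-memory cache shows when a fresh execution would happen. It is a simplified model under stated assumptions, not Latitude's implementation: the `PromptCache` class, its `run` method, and the `execute` callback are hypothetical, and entries are kept in a plain dictionary that never expires, mirroring the indefinite storage described above.

```python
import hashlib
import json


class PromptCache:
    """Toy model of the caching behavior (all names are hypothetical)."""

    def __init__(self, execute):
        self._execute = execute   # callable that performs the real prompt run
        self._store = {}          # entries are never evicted in this sketch
        self.provider_calls = 0   # counts fresh executions, for illustration

    def _key(self, workspace_id, config, messages):
        payload = json.dumps([workspace_id, config, messages], sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def run(self, workspace_id, config, messages):
        temperature = config.get("temperature")
        cacheable = temperature is None or temperature == 0
        key = self._key(workspace_id, config, messages)
        if cacheable and key in self._store:
            return self._store[key]
        # Fresh execution; if it raises, nothing is stored.
        self.provider_calls += 1
        response = self._execute(config, messages)
        if cacheable:
            self._store[key] = response
        return response
```

In this model, repeating a call with identical inputs reuses the stored response, while editing the configuration, extending the conversation, or setting a non-zero temperature each triggers a new call to `execute`.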
