Overview

The configuration section of a prompt in Latitude is required and defines how the prompt will be executed. It always sits at the top of the prompt file, is written in YAML format, and can include any key-value pairs supported by your LLM provider.

It is enclosed between three dashes (---):

---
model: gpt-4o
temperature: 0.6
topP: 0.9
---

Configuration Options

Provider (required)

The provider key specifies the LLM provider to use for the prompt. This key is always required in the configuration section, and must reference one of the providers you have configured in your Latitude workspace.

You can also select any available provider from the dropdown at the top of the prompt editor.

Learn more about configuring providers.

Model (required)

The model key specifies the model to use for the prompt. The available models depend on your LLM provider, and you can pick one from the dropdown at the top of the prompt editor.
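
For example, a minimal configuration for a workspace with a provider named OpenAI (the provider name is illustrative; use the name you gave the provider in your own workspace) would be:

---
provider: OpenAI
model: gpt-4o
---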

Additional Parameters

schema

Define the JSON schema for the prompt's response.

Learn more about how to define JSON schemas.
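
As a sketch, assuming the schema is written inline as YAML under the schema key (see the JSON schema docs linked above for the exact shape), a response constrained to an object with a single answer string could look like this:

---
model: gpt-4o
schema:
  type: object
  properties:
    answer:
      type: string
  required:
    - answer
---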

tools

Configure the tools available to the AI assistant.

Learn more about configuring tools in your prompt.
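
As a hypothetical sketch, assuming each tool is declared under the tools key with a description and JSON-schema parameters (see the tools docs linked above for the exact shape Latitude expects), a get_weather tool might be declared like this:

---
model: gpt-4o
tools:
  get_weather:
    description: Returns the current weather for a given city.
    parameters:
      type: object
      properties:
        city:
          type: string
      required:
        - city
---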

temperature

The sampling temperature to use for generation.

The value is passed through to the provider. The range depends on the provider and model. For most providers, 0 means almost deterministic results, and higher values mean more randomness.

It is recommended to set either temperature or topP, but not both.

Not configuring temperature, or setting it to 0, will cache the response for the prompt: the same response is returned when the prompt is called again with the same parameters.

Learn more about caching in Latitude.
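
For example (the model name and value are illustrative):

---
model: gpt-4o
temperature: 0.7   # any value above 0 skips the response cache and generates a fresh answer on each call
---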

maxTokens

Maximum number of tokens to generate.
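
For example (the value is illustrative):

---
model: gpt-4o
maxTokens: 500   # generate at most 500 tokens
---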

topP

Nucleus sampling.

The value is passed through to the provider. The range depends on the provider and model. For most providers, nucleus sampling is a number between 0 and 1. For example, 0.1 means that only tokens within the top 10% probability mass are considered.

It is recommended to set either temperature or topP, but not both.
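
For example, using nucleus sampling instead of temperature (the value is illustrative):

---
model: gpt-4o
topP: 0.1   # only tokens within the top 10% probability mass are considered; temperature is left unset
---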

topK

Only sample from the top K options for each subsequent token.

Used to remove “long tail” low-probability responses. Recommended for advanced use cases only; you usually only need to adjust temperature.
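
For example (the value is illustrative and passed straight through to the provider):

---
model: gpt-4o
topK: 40   # sample only from the 40 most likely tokens at each step
---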

presencePenalty

The presence penalty affects how likely the model is to repeat information that is already in the prompt.

The value is passed through to the provider. The range depends on the provider and model. For most providers, 0 means no penalty.

frequencyPenalty

The frequency penalty affects how likely the model is to repeatedly use the same words or phrases.

The value is passed through to the provider. The range depends on the provider and model. For most providers, 0 means no penalty.
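
For example (the values are illustrative; for most providers, positive values discourage repetition):

---
model: gpt-4o
presencePenalty: 0.5    # discourage repeating information already present in the prompt
frequencyPenalty: 0.5   # discourage reusing the same words or phrases
---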

stopSequences

The stop sequences used to stop text generation.

If set, the model will stop generating text when one of the stop sequences is generated. Providers may have limits on the number of stop sequences.
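
For example, assuming a standard YAML list (the sequences themselves are illustrative):

---
model: gpt-4o
stopSequences:
  - "###"
  - "END"
---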

seed

The integer seed to use for random sampling. If set and supported by the model, calls will generate deterministic results.
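
For example (the value is illustrative):

---
model: gpt-4o
seed: 42   # repeated calls with the same parameters return the same output, when the model supports it
---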

maxRetries

Maximum number of retries. Set to 0 to disable retries. Default: 2.
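
For example, to disable retries entirely:

---
model: gpt-4o
maxRetries: 0   # the default is 2
---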

abortSignal

An optional abort signal that can be used to cancel the call.

The abort signal can, for example, be forwarded from a user interface to cancel the call, or used to enforce a timeout.

headers

Additional HTTP headers to be sent with the request. Only applicable for HTTP-based providers.

You can use the request headers to provide additional information to the provider, depending on what the provider supports. For example, some observability providers support headers such as Prompt-Id.
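
As a sketch, assuming headers are given as a YAML map of header names to values (the Prompt-Id header comes from the example above; the value is hypothetical):

---
model: gpt-4o
headers:
  Prompt-Id: my-prompt-id
---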