Prompts

Prompts are the core building block of Latitude. They are the instructions or tasks you send to your model. Prompts can be as simple as a single sentence or as complex as a multi-step dialogue. You can use prompts to generate text, classify text, or perform any other task that your model is capable of.

In Latitude, prompts support parameters, chaining, shared snippets, and more advanced features like logic (if/else, loops) or version control.
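
For illustration, here is a minimal sketch of what a parameterized prompt can look like. The provider, model, and parameter names are placeholders, and the exact syntax is covered in the Prompt manager guide.

```
---
provider: OpenAI
model: gpt-4o
---

You are a concise support assistant for an online store.

<user>
  {{ question }}
</user>
```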

To find out more about prompts, check out the Prompt manager guide.

Logs

Logs are the records of the interactions between your prompts and your model. They contain the input prompt, the model’s response, and any other metadata that you choose to include. Logs are essential for evaluations, monitoring your model’s performance, and debugging any issues that may arise.

In Latitude, logs are automatically captured and stored whenever you run a prompt (either manually or through an endpoint). You can learn more about logs in the Logs guide.
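
As a rough mental model, a single log entry can be pictured as a record along the lines of the sketch below. The field names are illustrative assumptions, not Latitude's exact schema, which the Logs guide documents.

```typescript
// Illustrative shape only; these field names are assumptions, not Latitude's schema.
interface PromptLog {
  promptPath: string;                   // which prompt was run
  parameters: Record<string, unknown>;  // inputs supplied for this run
  response: string;                     // the model's output
  model: string;                        // model that produced the response
  tokens: number;                       // usage metadata
  durationMs: number;                   // how long the run took
  createdAt: Date;                      // when the interaction happened
}
```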

Evaluations

Evaluations allow you to assess your prompts' performance on real production logs. Each prompt in your project has its own evaluations. You can evaluate your prompts for accuracy, fluency, or any other criteria you choose. Evaluations fall into three main techniques:

  • LLM-as-judge: Large language models are used to evaluate the output of other models. This is useful when the criteria being evaluated are subjective or complex.
  • Programmatic Rules: Simple algorithmic rules that evaluate your prompt's output against a metric. Perfect for ground-truth testing and objective criteria, such as enforcing specific lengths or validating formats (see the sketch after this list).
  • Human-in-the-loop: You (or your team) manually review the logs and evaluate them based on your criteria. This is ideal when you need human verification.
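
As an example of a programmatic rule, the sketch below checks output length and JSON validity. It is a generic illustration of the kind of check involved, not Latitude's built-in rule API.

```typescript
// Generic sketch of a programmatic rule: pass or fail based on objective checks.
function evaluateResponse(response: string): { passed: boolean; reason: string } {
  // Rule 1: enforce a maximum length.
  if (response.length > 500) {
    return { passed: false, reason: "Response exceeds 500 characters" };
  }
  // Rule 2: validate that the output is well-formed JSON.
  try {
    JSON.parse(response);
  } catch {
    return { passed: false, reason: "Response is not valid JSON" };
  }
  return { passed: true, reason: "All checks passed" };
}
```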

Evaluations also generate logs that you will eventually use to fine-tune models or improve your prompts. To learn more about evaluations, check out the Evaluations guide.

Datasets

You can use datasets to add data in bulk to your Latitude workspace. Datasets are useful for evaluating performance at scale: they let you mock real-world data and test your prompts against different scenarios by running batch evaluations.
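
Conceptually, a dataset is a table of rows whose columns map to your prompt's parameters, optionally alongside an expected output for ground-truth checks. The rows below are a hypothetical example, not a required format:

```typescript
// Hypothetical dataset rows; each key would map to a prompt parameter.
const dataset = [
  { question: "Where is my order?", expected_topic: "shipping" },
  { question: "Can I get a refund?", expected_topic: "billing" },
  { question: "The app crashes on login.", expected_topic: "technical" },
];
```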

To learn more about datasets, check out the Datasets guide.