Overview

Latitude comes with a set of preconfigured evaluations to help you quickly start evaluating your LLM outputs.

Here’s the full list of preconfigured evaluation templates:

  • Adaptability
    • Evaluate how well the response adapts to user preferences or context
  • Bias and Fairness
    • Assess whether the response is free of bias or unfair generalizations
  • Coherence and Fluency
    • Evaluate the clarity and flow of the response
  • Conciseness
    • Assess whether the response is brief but informative
  • Consistency
    • Check if the response is consistent with prior information or context
  • Creativity
    • Evaluate the originality and imagination shown in the response
  • Domain Expertise
    • Assess the response for accuracy and knowledge in a specific domain
  • Engagement or User Experience
    • Rate how well the response engages the user or enhances the conversation
  • Error Handling and Recovery
    • Evaluate how well the response corrects user errors or misunderstandings
  • Ethical Compliance
    • Determine if the response follows ethical standards
  • Explainability
    • Rate how clearly the response explains the concept or information
  • Factuality
    • Evaluate whether the response is factually accurate
  • Faithfulness to Instructions
    • Assess how well the response follows the given instructions
  • Helpfulness and Informativeness
    • Rate how helpful and informative the response is
  • Formality and Style
    • Evaluate whether the response matches the desired formality or style
  • Hallucination Detection
    • Detect if the response introduces unsupported or false information
  • Harmlessness and Ethical Considerations
    • Check if the response promotes ethical and non-harmful behavior
  • Novelty
    • Assess the originality of the response in its content or style
  • Humor or Emotional Understanding
    • Rate whether the response appropriately uses humor or addresses emotional content
  • Redundancy
    • Check if the response repeats information unnecessarily
  • Relevance
    • Rate how well the response addresses the given context or query
  • Response Time or Latency
    • Measure whether the response time is suitable for real-time interaction
  • Satisfaction
    • Rate overall satisfaction with the response
  • Specificity
    • Evaluate how specific and relevant the response is to the query
  • Long-Term Consistency (in Multi-turn Dialogues)
    • Check if the response remains consistent over multiple turns of dialogue
  • Persuasiveness
    • Rate how convincing the response is
  • Toxicity and Safety
    • Check if the response contains harmful or inappropriate content
  • Uncertainty or Confidence
    • Evaluate if the response expresses appropriate confidence or acknowledges uncertainty

Custom evaluations

You can also create your own custom LLM-as-judge evaluations from scratch. Read the docs on custom evaluations to learn more.
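
To give a feel for what an LLM-as-judge evaluation involves, here is a minimal Python sketch. It is not Latitude's API: the function names (`build_judge_prompt`, `parse_score`, `evaluate`), the prompt wording, and the 1 to 5 scale are all assumptions made for illustration, and the judge model is replaced by a stub so the example runs on its own.

```python
import re
from typing import Callable


def build_judge_prompt(name: str, criterion: str, response: str) -> str:
    """Assemble a grading prompt for a judge model (hypothetical wording)."""
    return (
        f"You are evaluating an LLM response for: {name}.\n"
        f"Criterion: {criterion}\n"
        "Reply with a single integer score from 1 (worst) to 5 (best).\n\n"
        f"Response to evaluate:\n{response}"
    )


def parse_score(judge_reply: str, low: int = 1, high: int = 5) -> int:
    """Extract the first integer from the judge's reply and range-check it."""
    match = re.search(r"-?\d+", judge_reply)
    if match is None:
        raise ValueError(f"no score found in judge reply: {judge_reply!r}")
    score = int(match.group())
    if not low <= score <= high:
        raise ValueError(f"score {score} outside {low}..{high}")
    return score


def evaluate(
    name: str,
    criterion: str,
    response: str,
    judge: Callable[[str], str],
) -> int:
    """Run one evaluation: build the prompt, ask the judge, parse the score."""
    return parse_score(judge(build_judge_prompt(name, criterion, response)))


def fake_judge(prompt: str) -> str:
    """Stand-in for a real judge model call (assumption: it returns text)."""
    return "Score: 4"


score = evaluate(
    "Conciseness",
    "Assess whether the response is brief but informative",
    "Paris is the capital of France.",
    fake_judge,
)
print(score)
```

In a real setup, `fake_judge` would be swapped for a call to whichever model acts as the judge. Validating the parsed score keeps malformed judge replies from silently skewing results.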