Learn how to build a content moderation system that can analyze user-generated content and provide feedback on its appropriateness.
You can play with this example in the Latitude Playground.
In this example, we will create a content moderation system that can analyze user-generated content and provide feedback on its appropriateness. The agent uses subagents to handle different aspects of content moderation efficiently.
The system uses specialized subagents for different responsibilities: a rule_checker for deterministic, rule-based checks, a toxicity_analyzer for AI-based toxicity evaluation, and a safety_scorer for risk scoring and escalation decisions.
All tools used by the subagents must be defined in the main prompt.
Let’s break down the example step by step to understand how it works.
Main Prompt
The main prompt acts as the central coordinator. It receives user-generated content, delegates the moderation tasks to the specialized subagents, aggregates their results, and produces a structured final decision with confidence and reasoning.
rule_checker
The rule_checker agent checks for clear, rule-based violations—like banned words, excessive length, or explicit policy breaches—using programmatic filters and deterministic logic.
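To make this concrete, here is a minimal sketch of the kind of deterministic logic involved, assuming a hypothetical blacklist and length limit (the actual rules live in the example's code):

```typescript
// Illustrative values only: the real blacklist and limit come from the example's code.
const BANNED_WORDS = ['buy now', 'free money'];
const MAX_LENGTH = 2000;

type RuleViolation = { rule: string; detail: string };

function checkRules(content: string): RuleViolation[] {
  const violations: RuleViolation[] = [];
  const lowered = content.toLowerCase();

  // Banned-word check: simple case-insensitive substring match.
  for (const word of BANNED_WORDS) {
    if (lowered.includes(word)) {
      violations.push({ rule: 'banned_word', detail: `contains "${word}"` });
    }
  }

  // Length check: flag content over the configured maximum.
  if (content.length > MAX_LENGTH) {
    violations.push({ rule: 'excessive_length', detail: `${content.length} characters` });
  }

  return violations;
}
```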
toxicity_analyzer
The toxicity_analyzer (or toxicity_evaluator) uses advanced AI to evaluate whether the content contains toxicity, harassment, hate speech, or other forms of harmful language, considering nuance, context, and potential for implicit harm.
safety_scorer
The safety_scorer calculates various risk scores for the content, such as immediate harm, community impact, and escalation risk, and determines whether the situation requires human review or additional monitoring.
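Purely as an illustration of the thresholding idea (the scorer itself is an LLM agent, and the score names and thresholds below are assumptions, not the prompt's actual schema):

```typescript
// Hypothetical risk scores on a 0-1 scale, as produced by the safety_scorer agent.
type SafetyScores = {
  immediateHarm: number;
  communityImpact: number;
  escalationRisk: number;
};

// Example policy: escalate to a human when a high-risk score crosses a threshold,
// keep monitoring when community impact is moderate, otherwise let the content through.
function reviewPolicy(scores: SafetyScores): 'human_review' | 'monitor' | 'ok' {
  if (scores.immediateHarm > 0.7 || scores.escalationRisk > 0.8) return 'human_review';
  if (scores.communityImpact > 0.5) return 'monitor';
  return 'ok';
}
```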
Final Decision
The main agent synthesizes all subagent outputs, weighing rule violations, toxicity, and risk scores to make a final moderation decision. This decision includes a confidence score, explanation, and recommended action for handling the content.
The main prompt returns a structured output because the moderation process must be machine-readable and reliable, allowing easy integration with other systems and clear auditing of every moderation decision.
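As a rough idea of what that structured output can look like, here is a hypothetical shape (the field names are assumptions; the exact schema is defined in the main prompt):

```typescript
// Hypothetical structured output of the main prompt.
type ModerationDecision = {
  decision: 'approve' | 'flag' | 'reject'; // final moderation outcome
  confidence: number;                      // e.g. 0-1, how certain the agent is
  reasoning: string;                       // explanation behind the decision
  recommendedAction: string;               // e.g. "publish", "send to human review", "block"
};
```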
The code prepares four cases of possible user input from different sources. The full code is on GitHub; the idea is to run it with the different types of input and see how the system behaves.
The important part is the use of tools. The tool handlers defined in the code respond to the tool calls declared in the main prompt. These tools are under your control and cover things that usually don't need an LLM to answer, such as measuring the length of the text or checking whether it contains words from a blacklist.
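Here is a sketch of how that wiring can look when running the main prompt from code. The prompt path, tool names, argument shapes, blacklist, and sample inputs are assumptions for illustration, and the exact Latitude SDK signature may differ from what is shown; check the repository for the real implementation.

```typescript
// Sketch only: the prompt path, tool names, argument shapes, blacklist, and sample
// inputs below are assumptions for illustration; see the repository for the real code.
import { Latitude } from '@latitude-data/sdk';

const latitude = new Latitude(process.env.LATITUDE_API_KEY!, { projectId: 123 });

const BLACKLIST = ['buy now', 'free money']; // illustrative blacklist

// Illustrative inputs standing in for (not copied from) the example's prepared cases.
const samples = [
  'Great article, thanks for sharing!',
  'BUY NOW!!! Free money, click this link!!!',
];

async function main() {
  for (const content of samples) {
    const result = await latitude.prompts.run('content-moderation/main', {
      parameters: { content },
      // Handlers for the client-side tools declared in the main prompt.
      // Plain, deterministic logic: no LLM is needed to answer them.
      tools: {
        measure_length: async ({ text }: any) => text.length,
        check_blacklist: async ({ text }: any) =>
          BLACKLIST.filter((word) => text.toLowerCase().includes(word)),
      },
    });
    console.log(result);
  }
}

main();
```

Because these handlers are plain functions, you can unit-test them and tweak the blacklist or length rules without touching any prompt.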