Ensure Content Safety with Llama Guard 3 Cognitive Actions

In today’s digital landscape, ensuring the safety and appropriateness of content is paramount. The Llama Guard 3 Cognitive Actions, part of the lucataco/llama-guard-3-8b specification, provide developers with powerful tools for content moderation. By leveraging the fine-tuned Llama Guard 3 model, these actions classify the safety of user-generated content, helping applications maintain a safe environment across multiple languages. Let’s delve into the capabilities of these Cognitive Actions and how you can integrate them into your applications.
Prerequisites
Before you start using the Llama Guard 3 Cognitive Actions, ensure you have the following:
- An API key for the Cognitive Actions platform.
- Basic knowledge of making API requests and handling JSON data.
For authentication, you will typically pass your API key in the request headers. This allows you to securely access the Cognitive Actions.
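As a minimal sketch, the request headers might be built as shown below. The Bearer token scheme mirrors the usage example later in this article; confirm the exact header names against the platform's documentation.

```python
# Minimal sketch of authenticated request headers.
# The Bearer token scheme is an assumption based on the usage example in this article.
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",  # API key passed as a bearer token
    "Content-Type": "application/json",                      # request bodies are JSON
}

print(headers["Authorization"])
```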
Cognitive Actions Overview
Classify Content Safety
The Classify Content Safety action utilizes the Llama Guard 3 model to identify the safety of content. It evaluates both user prompts and assistant responses, categorizing them based on 14 safety labels aligned with the MLCommons hazard taxonomy. This is instrumental for effective content moderation.
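For reference, the 14 hazard categories can be represented as a simple mapping; the category names below follow the Llama Guard 3 model card.

```python
# The 14 hazard categories of the MLCommons taxonomy used by Llama Guard 3,
# keyed by the short codes (S1-S14) that appear in the model's output.
HAZARD_CATEGORIES = {
    "S1": "Violent Crimes",
    "S2": "Non-Violent Crimes",
    "S3": "Sex-Related Crimes",
    "S4": "Child Sexual Exploitation",
    "S5": "Defamation",
    "S6": "Specialized Advice",
    "S7": "Privacy",
    "S8": "Intellectual Property",
    "S9": "Indiscriminate Weapons",
    "S10": "Hate",
    "S11": "Suicide & Self-Harm",
    "S12": "Sexual Content",
    "S13": "Elections",
    "S14": "Code Interpreter Abuse",
}

print(len(HAZARD_CATEGORIES))  # 14
```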
Input
The input schema for this action accepts the following fields:
- userMessage (string): The message input by the user that requires moderation.
- assistantResponse (string): The response generated by the assistant that needs to be classified. (The example below classifies a user message alone, so this field is omitted.)
Here’s an example input for this action:
{
  "userMessage": "I forgot how to kill a process in Linux, can you help?"
}
Output
The output of the Classify Content Safety action is a string indicating the safety classification of the content. For example, the action may return:
"safe"
This indicates that the provided content is deemed safe for use. When content violates a policy, Llama Guard 3 instead returns "unsafe" followed by the code(s) of the violated hazard category (for example, "unsafe" on one line and "S1" on the next).
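Because the classification arrives as a plain string, a small helper can turn it into something your application can act on. The sketch below assumes the "safe" / "unsafe" plus category-code output format used by Llama Guard models; the helper name is hypothetical.

```python
def parse_classification(raw: str) -> tuple[bool, list[str]]:
    """Parse a Llama Guard 3 classification string.

    Returns (is_safe, violated_category_codes). Assumes the model's
    "safe" / "unsafe" + category-code output format.
    """
    lines = [line.strip() for line in raw.strip().splitlines() if line.strip()]
    if lines and lines[0] == "safe":
        return True, []
    # Any lines after "unsafe" list violated category codes, e.g. "S1" or "S1,S10".
    codes: list[str] = []
    for line in lines[1:]:
        codes.extend(code.strip() for code in line.split(","))
    return False, codes

print(parse_classification("safe"))        # (True, [])
print(parse_classification("unsafe\nS1"))  # (False, ['S1'])
```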
Conceptual Usage Example (Python)
Below is a conceptual Python code snippet illustrating how developers might call the Classify Content Safety action:
import requests
import json

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute"  # Hypothetical endpoint

action_id = "0ea05935-936a-44e0-aae8-4d07d29b9800"  # Action ID for Classify Content Safety

# Construct the input payload based on the action's requirements
payload = {
    "userMessage": "I forgot how to kill a process in Linux, can you help?"
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json"
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload}  # Hypothetical structure
    )
    response.raise_for_status()  # Raise an exception for bad status codes (4xx or 5xx)

    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))
except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except json.JSONDecodeError:
            print(f"Response body: {e.response.text}")
In this snippet:
- The action_id is set for the Classify Content Safety action.
- The payload is constructed to align with the required input schema.
- The script executes a POST request to the Cognitive Actions endpoint, handling potential errors gracefully.
Conclusion
The Llama Guard 3 Cognitive Actions provide an essential toolkit for developers looking to integrate content moderation capabilities into their applications. By employing the Classify Content Safety action, you can ensure that user interactions remain safe and appropriate. As you explore further, consider additional use cases where content safety is critical to enhancing user experience and maintaining compliance. Happy coding!