Assess Text Toxicity with the fofr/prompt-classifier Cognitive Actions

23 Apr 2025

In content moderation, ensuring that user-generated prompts are safe and appropriate is crucial, especially in applications involving text-to-image generation. The "fofr/prompt-classifier" spec provides Cognitive Actions for assessing prompt toxicity. With these pre-built actions, developers can integrate toxicity evaluation into their applications with minimal effort, improving both user experience and safety.

Prerequisites

Before you can start using the Cognitive Actions, ensure you have the following:

  • An API key for the Cognitive Actions platform to authenticate your requests.
  • Familiarity with JSON payload structures for sending requests and handling responses.

Authentication typically involves passing your API key in the request headers, ensuring secure access to the services.

Cognitive Actions Overview

Determine Prompt Toxicity

Description: This action assesses the toxicity of text-to-image prompts using a fine-tuned LLaMA-13B model, providing a safety ranking from 0 (safe) to 10 (toxic). It falls under the category of content moderation.

Input

The input for this action is structured as follows:

  • prompt (required): The prompt to evaluate for toxicity.
  • seed (optional): Random seed for reproducibility. If omitted, a random seed is used.
  • topK (optional): Restricts sampling to the K most probable tokens at each decoding step (default: 50).
  • topP (optional): Nucleus sampling; samples from the smallest set of tokens whose cumulative probability exceeds P (default: 0.9).
  • debug (optional): If enabled, outputs debugging information to the logs.
  • temperature (optional): Controls the randomness of generated output (default: 0.75).
  • maximumNewTokens (optional): Maximum number of new tokens to generate (default: 128).
  • minimumNewTokens (optional): Minimum number of new tokens to generate. Set to -1 to disable.
  • terminationSequences (optional): One or more sequences at which generation stops (the example payload below passes a single string).

Here’s an example of the input JSON payload:

{
  "topK": 50,
  "topP": 0.9,
  "debug": false,
  "prompt": "[PROMPT] a photo of a cat [/PROMPT] [SAFETY_RANKING]",
  "temperature": 0.75,
  "maximumNewTokens": 128,
  "minimumNewTokens": -1,
  "terminationSequences": "[/SAFETY_RANKING]"
}
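Note that the prompt is not passed raw: it must be wrapped in the classifier's `[PROMPT] … [/PROMPT] [SAFETY_RANKING]` tag format shown above. As a sketch, a small helper (hypothetical, not part of the spec) can apply the template and defaults so callers only supply the raw prompt text:

```python
def build_toxicity_payload(raw_prompt: str, **overrides) -> dict:
    """Wrap a raw prompt in the classifier's tag format and merge in defaults.

    Hypothetical helper: the defaults mirror the example payload above;
    keyword overrides replace any of them.
    """
    payload = {
        "topK": 50,
        "topP": 0.9,
        "debug": False,
        "prompt": f"[PROMPT] {raw_prompt} [/PROMPT] [SAFETY_RANKING]",
        "temperature": 0.75,
        "maximumNewTokens": 128,
        "minimumNewTokens": -1,
        "terminationSequences": "[/SAFETY_RANKING]",
    }
    payload.update(overrides)
    return payload

payload = build_toxicity_payload("a photo of a cat", temperature=0.5)
```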

Output

The output of this action typically returns an integer representing the toxicity ranking of the prompt, with 0 indicating a safe prompt and higher values indicating increasing levels of toxicity. For example, a response might look like:

0
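Depending on how the platform wraps the model output, the ranking may arrive as a bare integer or as text with whitespace or trailing tags around it. A defensive parser can extract and validate the value; this is a sketch, since the exact response shape is an assumption:

```python
import re

def parse_safety_ranking(raw_output: str) -> int:
    """Extract the first integer ranking (0-10) from the model output."""
    match = re.search(r"\d+", raw_output)
    if match is None:
        raise ValueError(f"No ranking found in output: {raw_output!r}")
    ranking = int(match.group())
    if not 0 <= ranking <= 10:
        raise ValueError(f"Ranking out of expected range: {ranking}")
    return ranking

parse_safety_ranking(" 0 [/SAFETY_RANKING]")  # -> 0
```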

Conceptual Usage Example (Python)

Here's how you might call the Cognitive Actions execution endpoint to determine prompt toxicity:

import requests
import json

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute" # Hypothetical endpoint

action_id = "1aeaea1b-204c-47a9-8a96-60092d05780c" # Action ID for Determine Prompt Toxicity

# Construct the input payload based on the action's requirements
payload = {
    "topK": 50,
    "topP": 0.9,
    "debug": False,
    "prompt": "[PROMPT] a photo of a cat [/PROMPT] [SAFETY_RANKING]",
    "temperature": 0.75,
    "maximumNewTokens": 128,
    "minimumNewTokens": -1,
    "terminationSequences": "[/SAFETY_RANKING]"
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json"
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload}, # Hypothetical structure
        timeout=30 # Avoid hanging indefinitely on network issues
    )
    response.raise_for_status() # Raise an exception for bad status codes (4xx or 5xx)

    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))

except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except json.JSONDecodeError:
            print(f"Response body: {e.response.text}")

In this code snippet, replace the API key and endpoint with your own. The action ID for "Determine Prompt Toxicity" should be used in the request, and the input payload is structured according to the defined schema.
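Once you have the ranking, a simple threshold turns it into a moderation decision. The cutoff below is a hypothetical value; tune it against your own content policy and observed rankings:

```python
TOXICITY_THRESHOLD = 3  # Hypothetical cutoff; rankings at or above this are rejected

def is_prompt_allowed(ranking: int, threshold: int = TOXICITY_THRESHOLD) -> bool:
    """Allow a prompt only when its toxicity ranking is below the threshold."""
    return ranking < threshold

# A safe prompt (ranking 0) passes; a highly toxic one (ranking 8) is blocked.
print(is_prompt_allowed(0))  # -> True
print(is_prompt_allowed(8))  # -> False
```

Keeping the threshold in one place makes it easy to adjust as you gather data on how the classifier scores your users' prompts.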

Conclusion

The "fofr/prompt-classifier" Cognitive Actions offer developers a straightforward way to integrate prompt toxicity assessment into their applications. By utilizing the "Determine Prompt Toxicity" action, you can enhance content moderation capabilities, ensuring that user-generated prompts align with community standards. Consider exploring additional use cases where prompt evaluation can improve user interactions and content quality.