Elevate Your Image Generation with Cognitive Actions for ControlNet Inference

23 Apr 2025

In the world of image generation, the prompthunt/cog-sdxl-controlnet-inference API offers powerful Cognitive Actions to create stunning visuals with minimal effort. These pre-built actions streamline the process of generating images using the ControlNet model, allowing developers to harness advanced features like img2img and inpainting with a variety of customizable parameters. Whether you're looking to enhance creative projects or automate image generation tasks, these actions provide a robust solution that's easy to integrate.

Prerequisites

Before diving into the implementation of these Cognitive Actions, ensure you have the following prerequisites:

  • An API key for accessing the Cognitive Actions platform.
  • Basic knowledge of making API calls and working with JSON payloads.

Authentication typically involves passing your API key in the request headers, allowing you to securely access the functionalities provided by the platform.

Cognitive Actions Overview

Generate ControlNet Image

The Generate ControlNet Image action is designed to create images by processing a variety of input parameters, including prompts, images, and mask settings. This action supports both the img2img and inpainting functionalities, giving developers the flexibility to customize their output through various options such as guidance scale, inference steps, and safety checks.

Input

The input for this action requires a JSON payload with multiple fields. Below are the key parameters:

  • mask (string, optional): URI of the input mask; required only for inpaint mode.
  • seed (integer, optional): Integer seed for randomization (default: random).
  • image (string, optional): URI of the input image; required for img2img or inpaint operations.
  • width (integer, default: 1024): Width of the output image in pixels.
  • height (integer, default: 1024): Height of the output image in pixels.
  • prompt (string): Describes the desired content of the output image.
  • outputImageCount (integer, default: 1): Number of output images to generate (1-4).
  • guidanceScale (number, default: 7.5): Scale factor for classifier-free guidance.
  • negativePrompt (string): Specifies elements that should be minimized in the output.

Here’s an example of a complete input payload:

{
  "seed": 1234,
  "image": "https://replicate.delivery/pbxt/JtBXIeco0ymI5gPV7hbpb3N8gjPGQePmadfUmXV6njkeM7E9/w1024.jpeg",
  "width": 1024,
  "height": 1024,
  "prompt": "((Moody portrait)) of TOK man wearing glasses dressed in arctic fashion against an arctic backdrop with iceberg influences, perfect eyes",
  "outputImageCount": 1,
  "guidanceScale": 3,
  "negativePrompt": "defined jawline, plastic, blurry, grainy, [deformed | disfigured], poorly drawn, [bad : wrong] anatomy, [extra | missing | floating | disconnected] limb, (mutated hands and fingers), blurry"
}
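Before sending a payload like the one above, it can help to apply the documented defaults and range checks locally so malformed requests fail fast. The following sketch does exactly that; `validate_payload` is a hypothetical local helper, not part of the API, and the defaults and ranges are taken from the parameter list above.

```python
def validate_payload(payload: dict) -> dict:
    """Merge documented defaults into a Generate ControlNet Image payload
    and enforce the documented ranges. Raises ValueError on bad input."""
    merged = {
        "width": 1024,
        "height": 1024,
        "outputImageCount": 1,
        "guidanceScale": 7.5,
        **payload,  # caller-supplied fields override the defaults
    }
    if not merged.get("prompt"):
        raise ValueError("prompt is required")
    if not 1 <= merged["outputImageCount"] <= 4:
        raise ValueError("outputImageCount must be between 1 and 4")
    return merged
```

Calling `validate_payload({"prompt": "a moody portrait"})` returns a payload with `width`, `height`, `outputImageCount`, and `guidanceScale` filled in from their defaults.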

Output

Upon successful execution, the action returns a JSON array containing the URIs of the generated images. Here’s a sample output:

[
  "https://assets.cognitiveactions.com/invocations/456c5f27-360e-406e-a4eb-6fa6c7921660/4cd560ee-437a-4b43-88b7-bc8167bf532e.png"
]
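Since the response is simply a JSON array of image URIs, a typical next step is to download each image locally. The sketch below assumes the array shape shown above; `save_generated_images` and `output_path` are hypothetical helper names, and the `requests` usage mirrors the snippet later in this post.

```python
import json
import os

import requests


def output_path(index: int, dest_dir: str = ".") -> str:
    """Local file path for the index-th generated image."""
    return os.path.join(dest_dir, f"output_{index}.png")


def save_generated_images(body: str, dest_dir: str = ".") -> list:
    """Parse the action's JSON response body (an array of image URIs)
    and download each image to dest_dir. Returns the local file paths."""
    uris = json.loads(body)
    paths = []
    for i, uri in enumerate(uris):
        resp = requests.get(uri, timeout=60)
        resp.raise_for_status()
        path = output_path(i, dest_dir)
        with open(path, "wb") as f:
            f.write(resp.content)
        paths.append(path)
    return paths
```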

Conceptual Usage Example (Python)

Below is a conceptual Python code snippet demonstrating how to call this action using a hypothetical Cognitive Actions execution endpoint:

import requests
import json

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute"  # Hypothetical endpoint

action_id = "0a2c7fde-2b7f-4c24-a7a0-7b21ebd13bdb"  # Action ID for Generate ControlNet Image

# Construct the input payload based on the action's requirements
payload = {
    "seed": 1234,
    "image": "https://replicate.delivery/pbxt/JtBXIeco0ymI5gPV7hbpb3N8gjPGQePmadfUmXV6njkeM7E9/w1024.jpeg",
    "width": 1024,
    "height": 1024,
    "prompt": "((Moody portrait)) of TOK man wearing glasses dressed in arctic fashion against an arctic backdrop with iceberg influences, perfect eyes",
    "outputImageCount": 1,
    "guidanceScale": 3,
    "negativePrompt": "defined jawline, plastic, blurry, grainy, [deformed | disfigured], poorly drawn, [bad : wrong] anatomy, [extra | missing | floating | disconnected] limb, (mutated hands and fingers), blurry"
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json"
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload}  # Hypothetical structure
    )
    response.raise_for_status()  # Raise an exception for bad status codes (4xx or 5xx)

    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))

except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError:  # body was not valid JSON
            print(f"Response body: {e.response.text}")

In this code snippet:

  • Replace the API key and endpoint with your actual credentials.
  • The payload is populated with the necessary input fields.
  • The code catches request failures and prints the response status and body to aid debugging.
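Image generation can take several seconds, so transient timeouts or 5xx responses are worth retrying with exponential backoff. The wrapper below is a local convention, not something the platform mandates; the retry counts and delays are illustrative.

```python
import time

import requests


def retry_delay(attempt: int, base: float = 2.0) -> float:
    """Exponential backoff delay in seconds for the given attempt (0-based)."""
    return base ** attempt


def post_with_retries(url, headers, body, attempts=3):
    """POST the request, retrying timeouts and 5xx responses with backoff.
    Returns the parsed JSON response on success."""
    for attempt in range(attempts):
        try:
            resp = requests.post(url, headers=headers, json=body, timeout=120)
            # Retry server-side errors; client errors (4xx) fail immediately.
            if resp.status_code >= 500 and attempt < attempts - 1:
                time.sleep(retry_delay(attempt))
                continue
            resp.raise_for_status()
            return resp.json()
        except requests.exceptions.Timeout:
            if attempt == attempts - 1:
                raise
            time.sleep(retry_delay(attempt))
```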

Conclusion

With the Generate ControlNet Image action from the prompthunt/cog-sdxl-controlnet-inference API, developers can easily create high-quality images tailored to their specific requirements. The flexibility offered by customizable parameters allows for unique and creative output. Start exploring the possibilities today and enhance your applications with advanced image generation capabilities!