Generate Stunning Images with ControlNet 1.1 Cognitive Actions

23 Apr 2025

The ControlNet 1.1 Cognitive Actions give developers a toolkit for generating images whose structure is conditioned on an input such as an edge map, a depth map, or a pose. The API exposes ControlNet 1.1, which aims for better robustness and image quality than earlier versions. Because the actions are pre-built, you can add image generation to an application without implementing the underlying algorithms yourself.

Prerequisites

Before you start integrating the ControlNet 1.1 Cognitive Actions into your application, ensure you have the following:

  • An API key for the Cognitive Actions platform.
  • Basic knowledge of JSON and RESTful API concepts.
  • Familiarity with programming in Python (or your preferred language).

Authentication typically involves passing your API key in the request headers, ensuring that only authorized users can access the Cognitive Actions.
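As a minimal sketch of that header-based authentication, the helper below builds the request headers from an API key. It assumes the platform accepts a Bearer token in the Authorization header (the same assumption the usage example later in this article makes), and the environment variable name COGNITIVE_ACTIONS_API_KEY is a hypothetical choice for illustration.

```python
import os


def build_auth_headers(api_key=None):
    """Return request headers that carry the API key as a Bearer token.

    Assumption: the platform accepts "Authorization: Bearer <key>".
    The environment variable name below is hypothetical.
    """
    key = api_key or os.environ.get("COGNITIVE_ACTIONS_API_KEY")
    if not key:
        raise ValueError(
            "No API key found; pass api_key or set COGNITIVE_ACTIONS_API_KEY"
        )
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Reading the key from the environment keeps it out of source control; passing api_key explicitly is convenient in tests.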

Cognitive Actions Overview

Generate Structured Image with ControlNet 1.1

The Generate Structured Image with ControlNet 1.1 action allows you to create structured images based on specified input parameters, including edge maps, depth maps, poses, and more. This action is part of the image-generation category and aims to enhance image quality and robustness compared to previous versions.

Input:

The input schema for this action is structured as follows:

  • imageUrl (required): A valid URI pointing to the input image.
  • prompt (required): A descriptive text prompt that guides the model in generating the desired content.
  • conditionStructure (required): The structure used to condition the model output (e.g., canny, depth, pose, scribble).
  • seed (optional): A seed for the random number generator; reuse the same value to get reproducible outputs.
  • scale (optional): Scale for classifier-free guidance, defaulting to 9.
  • steps (optional): Number of denoising steps, defaulting to 20.
  • negativePrompt (optional): Text that the model will avoid generating.
  • numberOfSamples (optional): Specifies how many samples to generate, defaulting to 1.
  • additionalPrompt (optional): Extra text appended to the main prompt for enhanced guidance.
  • squareImageResolution (optional): Resolution of the generated or processed square image, defaulting to 512.
  • lineDetectionLowThreshold (optional): Lower threshold for edge detection in canny conditions, defaulting to 100.
  • lineDetectionHighThreshold (optional): Higher threshold for edge detection in canny conditions, defaulting to 200.
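The schema above can be checked on the client before a request is sent. The following is a rough sketch, not part of the platform: the helper name check_inputs is hypothetical, and it only verifies field names (required fields present, no unknown fields), not value types or ranges, which the schema above does not specify.

```python
# Field names taken from the input schema listed above.
REQUIRED_FIELDS = {"imageUrl", "prompt", "conditionStructure"}
OPTIONAL_FIELDS = {
    "seed",
    "scale",
    "steps",
    "negativePrompt",
    "numberOfSamples",
    "additionalPrompt",
    "squareImageResolution",
    "lineDetectionLowThreshold",
    "lineDetectionHighThreshold",
}


def check_inputs(inputs):
    """Return a list of problems found in an input payload (empty if none)."""
    present = set(inputs)
    problems = []
    # Report required fields that are absent.
    for name in sorted(REQUIRED_FIELDS - present):
        problems.append(f"missing required field: {name}")
    # Report fields that the schema does not list (often typos).
    for name in sorted(present - REQUIRED_FIELDS - OPTIONAL_FIELDS):
        problems.append(f"unknown field: {name}")
    return problems
```

Catching a misspelled key such as "stepz" locally is cheaper than discovering that the server silently fell back to a default.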

Example Input:

{
  "scale": 9,
  "steps": 20,
  "prompt": "a photo of a brightly colored turtle",
  "imageUrl": "https://replicate.delivery/pbxt/IfYcADnrquHyFiJaLur1fgv6P4ZxKUyGgpuEWPamUAlF1VQI/user_1.png",
  "negativePrompt": "Longbody, lowres, bad anatomy, bad hands, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality",
  "numberOfSamples": "1",
  "additionalPrompt": "Best quality, extremely detailed",
  "conditionStructure": "scribble",
  "squareImageResolution": "512",
  "lineDetectionLowThreshold": 100,
  "lineDetectionHighThreshold": 200
}

Output:

The action typically returns an array of generated image URLs. For example:

[
  "https://assets.cognitiveactions.com/invocations/4907b4a3-ecd3-46cf-8cb4-301e02999bd2/1a293d94-9382-45ed-972a-78a9e4bb651a.png"
]
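If the result has the shape shown above (a JSON array of URL strings), a small helper can sanity-check it and derive a local file name for each image. This is a sketch under that assumption; the function name image_filenames is hypothetical, and downloading the files is left out.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def image_filenames(result):
    """Map each returned image URL to the file name at the end of its path."""
    if not isinstance(result, list):
        raise TypeError("expected a list of image URLs")
    names = {}
    for url in result:
        parsed = urlparse(url)
        # Reject anything that is not a plain HTTP(S) link.
        if parsed.scheme not in ("http", "https"):
            raise ValueError(f"not an HTTP(S) URL: {url!r}")
        names[url] = PurePosixPath(parsed.path).name
    return names
```

Applied to the example output above, the single URL maps to the name 1a293d94-9382-45ed-972a-78a9e4bb651a.png.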

Conceptual Usage Example (Python):

Here’s how you might call this action using Python:

import requests
import json

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute" # Hypothetical endpoint

action_id = "3afdb8e6-403b-4c68-823a-33d438d33fae" # Action ID for Generate Structured Image with ControlNet 1.1

# Construct the input payload based on the action's requirements
payload = {
    "scale": 9,
    "steps": 20,
    "prompt": "a photo of a brightly colored turtle",
    "imageUrl": "https://replicate.delivery/pbxt/IfYcADnrquHyFiJaLur1fgv6P4ZxKUyGgpuEWPamUAlF1VQI/user_1.png",
    "negativePrompt": "Longbody, lowres, bad anatomy, bad hands, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality",
    "numberOfSamples": "1",
    "additionalPrompt": "Best quality, extremely detailed",
    "conditionStructure": "scribble",
    "squareImageResolution": "512",
    "lineDetectionLowThreshold": 100,
    "lineDetectionHighThreshold": 200
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json"
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload}, # Hypothetical structure
        timeout=120 # Illustrative value; without a timeout the call can hang indefinitely
    )
    response.raise_for_status() # Raise an exception for bad status codes (4xx or 5xx)

    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))

except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError: # Body is not valid JSON (covers every JSON decode error requests can raise)
            print(f"Response body: {e.response.text}")

In this code snippet, replace "YOUR_COGNITIVE_ACTIONS_API_KEY" with your actual API key. The action ID corresponds to the Generate Structured Image with ControlNet 1.1 action, and the input payload follows the schema described above. The endpoint URL and the shape of the request body are hypothetical, so check the platform documentation for the actual values.

Conclusion

The ControlNet 1.1 Cognitive Actions let developers generate structured images from edge maps, depth maps, poses, and similar conditions without implementing the image-generation algorithms themselves. Integrating these actions into an application can enhance the user experience, automate workflows, and open up creative possibilities. Experiment with different input configurations, such as the condition structure, seed, guidance scale, and number of steps, to see how each one affects the result.
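One way to experiment systematically is to generate every combination of a few input values from a base payload. The sketch below only builds the payloads and does not call the API; the helper name payload_variants is hypothetical, and the field names come from the input schema described earlier.

```python
from itertools import product


def payload_variants(base, **options):
    """Yield copies of `base` with every combination of the given option values."""
    keys = list(options)
    for combo in product(*(options[k] for k in keys)):
        variant = dict(base)  # Shallow copy so `base` is never modified
        variant.update(zip(keys, combo))
        yield variant


base = {
    "imageUrl": "https://example.com/input.png",  # Placeholder URL
    "prompt": "a photo of a brightly colored turtle",
}
variants = list(
    payload_variants(base, seed=[1, 2], conditionStructure=["canny", "depth"])
)
```

Each variant can then be sent as the inputs of a separate request, which makes it easy to compare results side by side.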