Generate Stunning Images with wglint/4_sdxl Cognitive Actions

21 Apr 2025

Image generation lets developers create visuals directly from textual descriptions. The wglint/4_sdxl specification exposes Cognitive Actions built on the Stable Diffusion XL 1.0 model. These actions generate images from user-defined prompts and offer options for refinement and customization. This post walks through their key features and shows how to integrate them into your applications.

Prerequisites

Before you dive into using the Cognitive Actions, ensure you have the following:

  • An API key for the Cognitive Actions platform to authenticate your requests.
  • Familiarity with making HTTP requests in your programming environment.
  • A basic understanding of JSON, since both the input and the output use this format.

To authenticate your requests, you will typically pass your API key in the headers of your HTTP calls.
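
As a minimal sketch of that step, the helper below builds the request headers from an API key, falling back to an environment variable when no key is passed. The variable name COGNITIVE_ACTIONS_API_KEY and the Bearer scheme are assumptions that mirror the conceptual example later in this post; check the platform documentation for the exact header it expects.

```python
import os


def build_headers(api_key=None):
    """Return HTTP headers carrying the API key.

    Assumption: the platform accepts a Bearer token in the Authorization header.
    """
    # Hypothetical environment variable name, used only for illustration.
    key = api_key or os.environ.get("COGNITIVE_ACTIONS_API_KEY")
    if not key:
        raise ValueError("No API key supplied and COGNITIVE_ACTIONS_API_KEY is not set")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Reading the key from the environment keeps it out of source control, which is usually preferable to hard-coding it.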

Cognitive Actions Overview

Generate Refined Image with Stable Diffusion XL

The Generate Refined Image with Stable Diffusion XL action allows developers to create images from textual prompts using the Stable Diffusion XL model. You can refine these images for enhanced quality and control various parameters like dimensions, guidance scale, and scheduler options.

Input

The input schema for this action includes the following fields:

  • seed (integer): A seed value for random number generation. Reusing the same seed helps produce consistent images. Default is 1334.
  • width (integer): The pixel width of the generated image. Default is 1024.
  • height (integer): The pixel height of the generated image. Default is 1024.
  • prompt (string): Text prompt guiding the image generation. Default is "A studio photo of a rainbow coloured cat".
  • refiner (boolean): Indicates whether to apply refinement to the image. Default is false.
  • scheduler (string): The scheduling algorithm used during generation, which affects image quality. Options include DDIM and DPMSolverMultistep, among others. Default is DDIM.
  • refinerNoise (number): Amount of noise added during the refinement process (0 to 1). Default is 0.8.
  • guidanceScale (number): Scale factor influencing adherence to the prompt. Default is 7.5.
  • negativePrompt (string): Elements to avoid in the image generation.
  • numberOfPictures (integer): Specifies how many images to generate (1 to 5). Default is 1.
  • numberOfInterferenceSteps (integer): Number of inference (denoising) steps, which affects detail and quality. The field name is spelled as shown. Default is 50.
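
The fields above can be collected into a small payload builder. This is a sketch rather than part of the platform: `DEFAULTS` and `build_payload` are hypothetical names, the empty default for negativePrompt is an assumption (the schema lists none), and only the two ranges stated in the list are checked.

```python
# Defaults as listed in the input schema above.
DEFAULTS = {
    "seed": 1334,
    "width": 1024,
    "height": 1024,
    "prompt": "A studio photo of a rainbow coloured cat",
    "refiner": False,
    "scheduler": "DDIM",
    "refinerNoise": 0.8,
    "guidanceScale": 7.5,
    "negativePrompt": "",  # assumption: no default is documented
    "numberOfPictures": 1,
    "numberOfInterferenceSteps": 50,
}


def build_payload(**overrides):
    """Merge overrides into the defaults and check the documented ranges."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise KeyError(f"Unknown input field(s): {sorted(unknown)}")
    payload = {**DEFAULTS, **overrides}
    if not 0 <= payload["refinerNoise"] <= 1:
        raise ValueError("refinerNoise must be between 0 and 1")
    if not 1 <= payload["numberOfPictures"] <= 5:
        raise ValueError("numberOfPictures must be between 1 and 5")
    return payload
```

For example, `build_payload(refiner=True, numberOfPictures=3)` returns the defaults with those two fields changed, while `build_payload(numberOfPictures=6)` raises a `ValueError` before any request is sent.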

Here is a practical example of the JSON payload needed to invoke this action:

{
  "seed": 1334,
  "width": 1024,
  "height": 1024,
  "prompt": "A studio photo of a rainbow coloured cat",
  "refiner": true,
  "scheduler": "DDIM",
  "refinerNoise": 0.8,
  "guidanceScale": 7.5,
  "negativePrompt": "",
  "numberOfPictures": 1,
  "numberOfInterferenceSteps": 50
}

Output

When successfully executed, this action returns an array containing the URLs of the generated images. Here’s an example of the expected output:

[
  "https://assets.cognitiveactions.com/invocations/dc8987af-018d-41ec-bf57-69e3222b9623/8c2fea1d-d89a-4f52-ab33-bd7928b3b2c7.png"
]
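
Since the output is a plain list of URLs, saving the images locally takes only a short loop. The sketch below uses the Python standard library; `filename_from_url` and `download_images` are hypothetical helper names, and it assumes the URLs can be fetched without extra authentication, which the source does not confirm.

```python
import os
from urllib.parse import urlparse
from urllib.request import urlopen


def filename_from_url(url):
    """Use the last path segment of the URL as the local file name."""
    return urlparse(url).path.rsplit("/", 1)[-1]


def download_images(urls, out_dir="."):
    """Download each URL into out_dir and return the saved file paths."""
    os.makedirs(out_dir, exist_ok=True)
    saved = []
    for url in urls:
        path = os.path.join(out_dir, filename_from_url(url))
        # urlopen raises urllib.error.HTTPError on 4xx/5xx responses.
        with urlopen(url, timeout=60) as response, open(path, "wb") as fh:
            fh.write(response.read())
        saved.append(path)
    return saved
```

Calling `download_images(result, "images")` on the array returned by the action would write one PNG per URL into an `images` directory.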

Conceptual Usage Example (Python)

Here’s how you might call the Generate Refined Image with Stable Diffusion XL action using Python:

import requests
import json

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute" # Hypothetical endpoint

action_id = "a1072a9e-202b-49f6-b87c-5fd191aa1161" # Action ID for Generate Refined Image with Stable Diffusion XL

# Construct the input payload based on the action's requirements
payload = {
    "seed": 1334,
    "width": 1024,
    "height": 1024,
    "prompt": "A studio photo of a rainbow coloured cat",
    "refiner": True,
    "scheduler": "DDIM",
    "refinerNoise": 0.8,
    "guidanceScale": 7.5,
    "negativePrompt": "",
    "numberOfPictures": 1,
    "numberOfInterferenceSteps": 50
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json"
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload} # Hypothetical structure
    )
    response.raise_for_status() # Raise an exception for bad status codes (4xx or 5xx)

    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))

except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError:  # body is not valid JSON (covers JSONDecodeError variants)
            print(f"Response body: {e.response.text}")

Replace the placeholders in this snippet with your actual API key and endpoint. The endpoint URL and the request body structure shown here are hypothetical, so confirm both against the platform documentation. The action ID for Generate Refined Image with Stable Diffusion XL is included, and the input payload follows the schema described above.

Conclusion

The wglint/4_sdxl Cognitive Actions give developers a way to generate and refine images with the Stable Diffusion XL model. With control over dimensions, guidance scale, scheduler, and refinement, you can tailor the output to your application, whether for marketing, creative projects, or other use cases. Start by sending the example payload, then adjust the parameters to see how each one changes the result.