Generate Stunning Images with the asronline/sdxl-btas-titles Cognitive Actions

23 Apr 2025

Generating and manipulating images programmatically is a common need in application development. The asronline/sdxl-btas-titles specification provides Cognitive Actions that create images from text prompts and refine them with inpainting and refiner options. This guide introduces one of the key actions in this spec and shows how to call it from your application.

Prerequisites

Before diving into the Cognitive Actions, ensure you have the following:

  • An API key for the Cognitive Actions platform.
  • Basic knowledge of making HTTP requests and working with JSON.

To authenticate, pass your API key in the request headers. The Python example later in this guide sends it as a Bearer token in the Authorization header.
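The header construction can be sketched as a small helper. This is a minimal sketch, assuming the Bearer token scheme used in the Python example later in this guide. The function name build_auth_headers and the environment variable COGNITIVE_ACTIONS_API_KEY are hypothetical choices for illustration, not names defined by the platform.

```python
import os


def build_auth_headers(api_key=None):
    """Return request headers carrying the API key as a Bearer token (assumed scheme)."""
    # Fall back to an environment variable so the key stays out of source code.
    # The variable name is a hypothetical choice, not something the platform defines.
    key = api_key or os.environ.get("COGNITIVE_ACTIONS_API_KEY")
    if not key:
        raise ValueError("No API key provided")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Reading the key from the environment keeps it out of version control. Passing it explicitly remains possible for scripts and tests.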

Cognitive Actions Overview

Generate Image with Inpainting or Img2Img

This action generates images from a text prompt, with optional inpainting and refinement. You can customize parameters such as the image dimensions, scheduling algorithm, and guidance scale.

Input

The input for this action is a JSON object. The core fields are listed below; those with a default can be omitted:

  • prompt: A text description of the desired image (e.g., "In the style of BTAS, a silhouette of a batman in a city with a moody red sky").
  • width: The width of the output image in pixels (default is 1024).
  • height: The height of the output image in pixels (default is 1024).
  • outputCount: The number of output images to generate (1-4, default is 1).
  • guidanceScale: A scale factor for classifier-free guidance (default is 7.5).

Optional fields include:

  • mask: URI for the input mask in inpaint mode.
  • seed: Integer value for randomization.
  • negativePrompt: Elements to avoid in the generated image.
  • refinementStyle: The method for refining images, with options like "no_refiner", "expert_ensemble_refiner", and "base_image_refiner".
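The fields above can be assembled and checked before a request is sent. The following is a minimal sketch that uses only the field names, defaults, and the 1-4 range for outputCount listed above. The helper name build_payload and its Python-style argument names are hypothetical, and the platform may apply further validation that is not shown here.

```python
def build_payload(prompt, width=1024, height=1024, output_count=1,
                  guidance_scale=7.5, seed=None, negative_prompt=None,
                  refinement_style=None, mask=None):
    """Assemble an input object from the documented field names and defaults."""
    if not isinstance(prompt, str) or not prompt.strip():
        raise ValueError("prompt must be a non-empty string")
    if not 1 <= output_count <= 4:
        raise ValueError("outputCount must be between 1 and 4")

    payload = {
        "prompt": prompt,
        "width": width,
        "height": height,
        "outputCount": output_count,
        "guidanceScale": guidance_scale,
    }

    # Include optional fields only when the caller actually set them.
    optional = {
        "seed": seed,
        "negativePrompt": negative_prompt,
        "refinementStyle": refinement_style,
        "mask": mask,
    }
    for name, value in optional.items():
        if value is not None:
            payload[name] = value
    return payload
```

For example, build_payload("In the style of BTAS, a city skyline", seed=42) returns the five core fields plus seed, and leaves out negativePrompt, refinementStyle, and mask.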

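Example Input JSON (this example also sets fields not described above, such as loraScale, promptStrength, and schedulingAlgorithm):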

{
  "width": 1024,
  "height": 1024,
  "prompt": "In the style of BTAS, a silhouette of a batman in a city with a moody red sky",
  "loraScale": 0.6,
  "outputCount": 1,
  "guidanceScale": 7.5,
  "applyWatermark": true,
  "promptStrength": 0.8,
  "refinementStyle": "no_refiner",
  "highNoiseFraction": 0.8,
  "inferenceStepCount": 50,
  "schedulingAlgorithm": "K_EULER"
}

Output

The output of this action is typically a JSON array of URLs, each pointing to a generated image.

Example Output:

[
  "https://assets.cognitiveactions.com/invocations/bed31818-8edb-47ba-93d2-e7989c29914a/a3317654-31cc-4f26-8338-e71e2b7e16d7.png"
]
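Once the array of URLs is available, each image can be saved locally. The sketch below assumes the output has already been extracted as a plain list of URL strings, as in the example above, and that the URLs can be fetched without extra authentication; neither point is confirmed by the platform documentation. The helper names image_filenames and download_images are hypothetical.

```python
import os
import posixpath
import shutil
from urllib.parse import urlparse
from urllib.request import urlopen


def image_filenames(urls):
    """Derive a local filename for each image URL from the last path segment."""
    if not isinstance(urls, list):
        raise TypeError("expected a list of image URLs")
    return [posixpath.basename(urlparse(url).path) for url in urls]


def download_images(urls, dest_dir="."):
    """Download each URL into dest_dir and return the saved file paths."""
    os.makedirs(dest_dir, exist_ok=True)
    saved = []
    for url, name in zip(urls, image_filenames(urls)):
        target = os.path.join(dest_dir, name)
        # Stream the response body straight to disk.
        with urlopen(url, timeout=60) as resp, open(target, "wb") as fh:
            shutil.copyfileobj(resp, fh)
        saved.append(target)
    return saved
```

With outputCount greater than 1, the same loop handles every returned image, since each URL yields its own filename.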

Conceptual Usage Example (Python)

Here's how a developer might invoke this action using Python:

import requests
import json

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute"  # Hypothetical endpoint

action_id = "9017f3de-a243-4fd7-b66c-def3203c835d"  # Action ID for Generate Image with Inpainting or Img2Img

# Construct the input payload based on the action's requirements
payload = {
    "width": 1024,
    "height": 1024,
    "prompt": "In the style of BTAS, a silhouette of a batman in a city with a moody red sky",
    "loraScale": 0.6,
    "outputCount": 1,
    "guidanceScale": 7.5,
    "applyWatermark": True,
    "promptStrength": 0.8,
    "refinementStyle": "no_refiner",
    "highNoiseFraction": 0.8,
    "inferenceStepCount": 50,
    "schedulingAlgorithm": "K_EULER"
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json"
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload},  # Hypothetical structure
        timeout=120,  # Fail instead of hanging indefinitely if the server does not respond
    )
    response.raise_for_status()  # Raise an exception for bad status codes (4xx or 5xx)

    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))

except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError:  # Body is not valid JSON; JSON decode errors subclass ValueError
            print(f"Response body: {e.response.text}")

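In this code snippet, replace YOUR_COGNITIVE_ACTIONS_API_KEY with your actual API key. The action_id selects the Generate Image with Inpainting or Img2Img action, and the payload follows the input fields described above. The endpoint URL and the request body structure are hypothetical, so confirm both against the platform's documentation before use.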

Conclusion

The asronline/sdxl-btas-titles Cognitive Actions let developers generate and refine images programmatically. With the action described in this article, you can produce BTAS-style visuals from a text prompt and tune the result through parameters such as guidanceScale, refinementStyle, and negativePrompt. From here, experiment with the optional fields to fit the output to your application's needs.