Harnessing Image Generation with the fofr/sdxl-abstract Cognitive Actions

24 Apr 2025

In the world of AI-driven image generation, the fofr/sdxl-abstract specification offers powerful Cognitive Actions that allow developers to create and refine images based on descriptive text prompts. These pre-built actions enable seamless integration of advanced functionalities into applications, enhancing creative projects with minimal effort. By leveraging these capabilities, developers can automate complex image transformation processes and deliver stunning visual content.

Prerequisites

To get started with the Cognitive Actions from the fofr/sdxl-abstract specification, you'll need:

  • An API key for accessing the Cognitive Actions platform.
  • Basic knowledge of making HTTP requests and handling JSON data.
  • Familiarity with Python for the conceptual code examples provided.

Authentication typically involves passing your API key in the headers of your requests.
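As a minimal sketch, assuming a Bearer-token scheme (the exact scheme may differ for your deployment), the headers can be built like this:

```python
# Build request headers for the Cognitive Actions API.
# Assumes a Bearer-token scheme; adjust if your deployment differs.
API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"  # placeholder, not a real key

headers = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json",
}
```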

Cognitive Actions Overview

Generate Enhanced Images

Description:
This action generates and refines images based on a specified text prompt, utilizing advanced inpainting and img2img techniques. It offers customizable options for resolution, number of outputs, and various refinement styles, allowing for high-quality image generation.

Category: Image Generation

Input

The input schema for the Generate Enhanced Images action consists of several fields, each with specific requirements:

  • prompt (string): Descriptive text for image generation.
    Example: "A TOK abstract photo"
  • width (integer): Width of the output image in pixels.
    Default: 1024
    Example: 768
  • height (integer): Height of the output image in pixels.
    Default: 1024
    Example: 1152
  • refine (string): Style of refinement to apply. Options include:
    • no_refiner
    • expert_ensemble_refiner
    • base_image_refiner
    Default: no_refiner
    Example: "expert_ensemble_refiner"
  • loraScale (number): LoRA additive scale factor (between 0 and 1).
    Default: 0.6
    Example: 0.6
  • scheduler (string): Algorithm for the denoising process. Options include:
    • DDIM
    • DPMSolverMultistep
    • HeunDiscrete
    • KarrasDPM
    • K_EULER_ANCESTRAL
    • K_EULER
    • PNDM
    Default: K_EULER
    Example: "K_EULER"
  • guidanceScale (number): Scale for classifier-free guidance (between 1 and 50).
    Default: 7.5
    Example: 7.5
  • applyWatermark (boolean): Whether to apply a watermark to the generated images.
    Default: true
    Example: false
  • negativePrompt (string): Specifies undesired elements in the generated image.
    Default: ""
    Example: ""
  • promptStrength (number): Strength of the prompt when using img2img or inpainting (between 0 and 1).
    Default: 0.8
    Example: 0.8
  • numberOfOutputs (integer): Total number of images to generate (up to 4).
    Default: 1
    Example: 1
  • highNoiseFraction (number): Fraction of noise for refining (between 0 and 1).
    Default: 0.8
    Example: 0.8
  • numberOfInferenceSteps (integer): Total denoising steps in the generation process (between 1 and 500).
    Default: 50
    Example: 30
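The defaults and ranges above can be enforced client-side before a request is sent. The `build_payload` helper below is hypothetical (it is not part of the specification); it merges caller overrides with the documented defaults and clamps numeric fields to their stated ranges:

```python
# Documented defaults from the Generate Enhanced Images input schema.
DEFAULTS = {
    "width": 1024,
    "height": 1024,
    "refine": "no_refiner",
    "loraScale": 0.6,
    "scheduler": "K_EULER",
    "guidanceScale": 7.5,
    "applyWatermark": True,
    "negativePrompt": "",
    "promptStrength": 0.8,
    "numberOfOutputs": 1,
    "highNoiseFraction": 0.8,
    "numberOfInferenceSteps": 50,
}

# (min, max) ranges stated in the schema for the numeric fields.
RANGES = {
    "loraScale": (0, 1),
    "guidanceScale": (1, 50),
    "promptStrength": (0, 1),
    "numberOfOutputs": (1, 4),
    "highNoiseFraction": (0, 1),
    "numberOfInferenceSteps": (1, 500),
}

def build_payload(prompt, **overrides):
    """Hypothetical helper: apply defaults, then clamp to documented ranges."""
    payload = {**DEFAULTS, "prompt": prompt, **overrides}
    for field, (lo, hi) in RANGES.items():
        payload[field] = min(max(payload[field], lo), hi)
    return payload
```

For example, `build_payload("A TOK abstract photo", guidanceScale=100)` clamps `guidanceScale` back to the documented maximum of 50 while leaving the other defaults in place.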

Output

The output of the Generate Enhanced Images action is a JSON array of URLs, one per generated image. For example:

[
  "https://assets.cognitiveactions.com/invocations/7f964f26-8516-4495-9693-8fd39816728f/e0b18bc1-2ad5-433c-a1da-509225171718.png"
]
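Assuming the response body is that JSON array of URLs, a short sketch for deriving local filenames and saving each image follows. The download step is commented out because it requires network access:

```python
from urllib.parse import urlparse
from pathlib import PurePosixPath

def local_filename(url):
    """Derive a local filename from an asset URL (its last path component)."""
    return PurePosixPath(urlparse(url).path).name

# Example output array, as returned by the action.
output = [
    "https://assets.cognitiveactions.com/invocations/7f964f26-8516-4495-9693-8fd39816728f/e0b18bc1-2ad5-433c-a1da-509225171718.png"
]

for url in output:
    name = local_filename(url)
    # To actually fetch the image (requires network access):
    # import requests
    # with open(name, "wb") as f:
    #     f.write(requests.get(url, timeout=30).content)
    print(name)
```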

Conceptual Usage Example (Python)

Below is a conceptual Python code snippet demonstrating how to call the Generate Enhanced Images action:

import requests
import json

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute" # Hypothetical endpoint

action_id = "9d4f1129-b548-4e79-aed0-21a79a1c66a7" # Action ID for Generate Enhanced Images

# Construct the input payload based on the action's requirements
payload = {
    "width": 768,
    "height": 1152,
    "prompt": "A TOK abstract photo",
    "refine": "expert_ensemble_refiner",
    "loraScale": 0.6,
    "scheduler": "K_EULER",
    "guidanceScale": 7.5,
    "applyWatermark": False,
    "negativePrompt": "",
    "promptStrength": 0.8,
    "numberOfOutputs": 1,
    "highNoiseFraction": 0.8,
    "numberOfInferenceSteps": 30
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json"
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload} # Hypothetical structure
    )
    response.raise_for_status() # Raise an exception for bad status codes (4xx or 5xx)

    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))

except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError: # covers json.JSONDecodeError across requests versions
            print(f"Response body: {e.response.text}")

In this code snippet, replace YOUR_COGNITIVE_ACTIONS_API_KEY with your actual API key. The payload is structured to align with the input schema, ensuring the action is executed correctly.

Conclusion

The fofr/sdxl-abstract Cognitive Actions provide developers with robust tools for generating and refining images based on textual prompts. By integrating these actions into your applications, you can enhance user experiences and streamline creative processes. Explore various use cases, such as digital art creation, product visualization, or content generation, and unlock new possibilities in your projects.