Unlocking Creative Potential: Generate Images with the FOFR/SDXL-JWST Cognitive Action

22 Apr 2025

Generating images on demand is useful for developers and creators who need custom visuals inside their applications. The FOFR/SDXL-JWST Cognitive Actions expose the SDXL JWST model for image generation through a pre-built API action. Besides generating images from a text prompt, the action supports img2img and inpainting, so you can also modify existing images to fit your specific needs.

Prerequisites

Before diving into the Cognitive Actions, ensure you have the following:

  • An API key for the Cognitive Actions platform.
  • Familiarity with making HTTP requests, particularly for APIs.
  • Basic understanding of JSON structures.

Authentication typically involves passing your API key in the request headers, allowing you to securely interact with the Cognitive Actions.
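
As a minimal sketch of the header-based authentication described above, the helper below builds the request headers from an API key. The Bearer scheme matches the usage example later in this post; the environment variable name COGNITIVE_ACTIONS_API_KEY and the function name build_headers are assumptions made for illustration, not part of the platform.

```python
import os


def build_headers(api_key=None):
    """Return request headers that carry the API key as a Bearer token.

    The environment variable name is an assumption for this sketch.
    """
    key = api_key or os.environ.get("COGNITIVE_ACTIONS_API_KEY")
    if not key:
        raise ValueError(
            "No API key found: pass api_key or set COGNITIVE_ACTIONS_API_KEY"
        )
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Reading the key from the environment keeps it out of source control, while the explicit argument is convenient for tests and scripts.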

Cognitive Actions Overview

Generate Image Using SDXL-JWST

Description: Generate images with custom settings, including mask input, refinement, and guidance, using the SDXL JWST model. This action supports img2img and inpainting modes in addition to prompt-only generation, enabling a wide range of creative outputs.

Category: Image Generation

Input: The input schema for this action includes several parameters, allowing for extensive customization of the generated images:

  • mask: (string, optional) URI of the input mask for inpaint mode. Black areas are preserved and white areas are inpainted.
  • seed: (integer, optional) Random seed for generating outputs. Omit it to use a random seed.
  • image: (string, optional) URI of the input image for img2img or inpaint mode processing.
  • width: (integer, default: 1024) Width of the output image in pixels.
  • height: (integer, default: 1024) Height of the output image in pixels.
  • prompt: (string, default: "An astronaut riding a rainbow unicorn") Text prompt for the image generation to guide the output.
  • refine: (string, default: "no_refiner") The style of refinement to apply to the generated image.
  • loraScale: (number, default: 0.6) Scale factor for LoRA model adaptation; applies only to trained models.
  • scheduler: (string, default: "K_EULER") Algorithm to use for scheduling the generation process.
  • guidanceScale: (number, default: 7.5) Classifier-free guidance scale.
  • applyWatermark: (boolean, default: true) Determines whether a watermark is applied to identify generated images in downstream applications.
  • negativePrompt: (string, optional) A prompt to guide what not to include in the generated image.
  • promptStrength: (number, default: 0.8) Strength of the prompt influence in img2img/inpaint mode.
  • numberOfOutputs: (integer, default: 1) Number of images to generate, between 1 and 4.
  • refinementSteps: (integer, optional) Number of refinement steps when using 'base_image_refiner'.
  • highNoiseFraction: (number, default: 0.8) Fraction of high noise used by 'expert_ensemble_refiner'.
  • numberOfInferenceSteps: (integer, default: 50) Total number of steps for the denoising process.
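
The parameter list above can be turned into a small payload builder that fills in the documented defaults and rejects misspelled keys before a request is sent. This is a sketch only: the names DEFAULTS, OPTIONAL_KEYS, and build_payload are hypothetical, and the only range check included is the one the list states explicitly (numberOfOutputs between 1 and 4).

```python
# Defaults copied from the parameter list above.
DEFAULTS = {
    "width": 1024,
    "height": 1024,
    "prompt": "An astronaut riding a rainbow unicorn",
    "refine": "no_refiner",
    "loraScale": 0.6,
    "scheduler": "K_EULER",
    "guidanceScale": 7.5,
    "applyWatermark": True,
    "promptStrength": 0.8,
    "numberOfOutputs": 1,
    "highNoiseFraction": 0.8,
    "numberOfInferenceSteps": 50,
}

# Parameters listed as optional, with no default value.
OPTIONAL_KEYS = {"mask", "seed", "image", "negativePrompt", "refinementSteps"}


def build_payload(**overrides):
    """Merge caller overrides into the documented defaults and validate them."""
    unknown = set(overrides) - set(DEFAULTS) - OPTIONAL_KEYS
    if unknown:
        raise ValueError(f"Unknown parameter(s): {sorted(unknown)}")

    payload = {**DEFAULTS, **overrides}

    n = payload["numberOfOutputs"]
    if not isinstance(n, int) or not 1 <= n <= 4:
        raise ValueError("numberOfOutputs must be an integer between 1 and 4")

    return payload
```

For instance, build_payload(prompt="a photo taken by TOK, astrophotography", numberOfOutputs=4) yields a payload equivalent to the example input in this post.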

Example Input:

{
  "width": 1024,
  "height": 1024,
  "prompt": "a photo taken by TOK, astrophotography",
  "refine": "no_refiner",
  "loraScale": 0.6,
  "scheduler": "K_EULER",
  "guidanceScale": 7.5,
  "applyWatermark": true,
  "promptStrength": 0.8,
  "numberOfOutputs": 4,
  "highNoiseFraction": 0.8,
  "numberOfInferenceSteps": 50
}
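
The example above uses prompt-only generation. For inpaint mode, the schema suggests supplying image and mask together with a prompt. The sketch below assembles such a payload; the function name inpaint_payload and the example.com URLs are placeholders, and the assumption that the service fills in omitted parameters with their defaults has not been verified against the platform.

```python
def inpaint_payload(prompt, image_url, mask_url, prompt_strength=0.8, seed=None):
    """Assemble an input payload for inpaint mode.

    Per the schema, black mask areas are preserved and white areas are inpainted.
    """
    payload = {
        "prompt": prompt,
        "image": image_url,
        "mask": mask_url,
        "promptStrength": prompt_strength,
    }
    # Only send a seed when the caller wants reproducible output.
    if seed is not None:
        payload["seed"] = seed
    return payload


example = inpaint_payload(
    prompt="a photo taken by TOK, astrophotography",
    image_url="https://example.com/input.png",  # placeholder URL
    mask_url="https://example.com/mask.png",    # placeholder URL
    seed=1234,
)
```

Dropping the mask argument from such a payload would correspond to plain img2img mode, where promptStrength controls how strongly the prompt influences the input image.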

Output: The action typically returns an array of generated image URIs. For example:

[
  "https://assets.cognitiveactions.com/invocations/c7203764-4c39-4dd1-be9f-4d79b88354c2/e467fe72-0579-4089-be79-d5fde967d120.png",
  "https://assets.cognitiveactions.com/invocations/c7203764-4c39-4dd1-be9f-4d79b88354c2/a7fc6589-d6df-4d1f-859c-3e17756f3e90.png",
  "https://assets.cognitiveactions.com/invocations/c7203764-4c39-4dd1-be9f-4d79b88354c2/86c5b127-d160-4c67-9252-3c467e402939.png",
  "https://assets.cognitiveactions.com/invocations/c7203764-4c39-4dd1-be9f-4d79b88354c2/5c753904-d7ef-4482-a286-04874fbbf34c.png"
]
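
Since the output is a list of image URIs, a typical next step is saving the files locally. The following sketch uses only the Python standard library and assumes the URIs can be fetched without extra authentication, which this post does not confirm. The helper names filename_for and download_images are hypothetical.

```python
import os
import posixpath
import shutil
import urllib.request
from urllib.parse import urlparse


def filename_for(url):
    """Derive a local file name from the last path segment of the URL."""
    return posixpath.basename(urlparse(url).path)


def download_images(urls, dest_dir="outputs"):
    """Download each URL into dest_dir and return the saved file paths."""
    os.makedirs(dest_dir, exist_ok=True)
    saved = []
    for url in urls:
        path = os.path.join(dest_dir, filename_for(url))
        # Stream the response body straight to disk.
        with urllib.request.urlopen(url, timeout=60) as resp, open(path, "wb") as fh:
            shutil.copyfileobj(resp, fh)
        saved.append(path)
    return saved
```

Calling download_images on the array returned by the action would write one PNG per generated image into the outputs directory.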

Conceptual Usage Example (Python): The following sketch shows how the action might be called from Python. The endpoint URL and request structure are hypothetical:

import requests
import json

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute" # Hypothetical endpoint

action_id = "f90032c2-1857-4eca-a4fd-8e761ee69223" # Action ID for Generate Image Using SDXL-JWST

# Construct the input payload based on the action's requirements
payload = {
    "width": 1024,
    "height": 1024,
    "prompt": "a photo taken by TOK, astrophotography",
    "refine": "no_refiner",
    "loraScale": 0.6,
    "scheduler": "K_EULER",
    "guidanceScale": 7.5,
    "applyWatermark": True,
    "promptStrength": 0.8,
    "numberOfOutputs": 4,
    "highNoiseFraction": 0.8,
    "numberOfInferenceSteps": 50
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json"
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload}, # Hypothetical structure
        timeout=120 # Avoid hanging indefinitely; adjust to the expected generation time
    )
    response.raise_for_status() # Raise an exception for bad status codes (4xx or 5xx)

    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))

except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError: # Body is not valid JSON; JSON decode errors subclass ValueError
            print(f"Response body: {e.response.text}")

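In this code snippet, replace YOUR_COGNITIVE_ACTIONS_API_KEY with your actual API key. The payload follows the input schema for the action, and action_id identifies the Generate Image Using SDXL-JWST action. The endpoint URL and the request body structure are hypothetical, so confirm both against the platform documentation before use.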

Conclusion

The FOFR/SDXL-JWST Cognitive Actions provide a flexible approach to image generation, letting developers produce custom visuals with a single API call. With prompt-only generation, img2img, and inpainting available through one input schema, you can add image generation to your applications and experiment with settings such as the scheduler, guidance scale, and refiner. Start integrating these actions today and watch your projects come to life!