Generate Stunning Images with the fofr/sdxl-vision-pro Cognitive Action

21 Apr 2025

Creating visually captivating images has never been easier thanks to the powerful capabilities of the fofr/sdxl-vision-pro API. By leveraging Cognitive Actions, developers can seamlessly integrate image generation features into their applications. This article will guide you through the process of using the Generate Fine-Tuned Images on Apple Vision Pro action, detailing its features, input requirements, and output results.

Introduction

The fofr/sdxl-vision-pro API provides advanced image generation capabilities built on an SDXL model fine-tuned on images of Apple's Vision Pro headset. With features such as inpainting, prompt guidance, and various refinement styles, developers can tune the generation process to balance speed against output quality. These pre-built actions save time and effort, allowing you to focus on building innovative applications.

Prerequisites

Before using Cognitive Actions, ensure you have the following:

  • An API key for the Cognitive Actions platform, which is necessary for authentication.
  • Basic familiarity with JSON payloads and the ability to make HTTP requests.

To authenticate, you'll typically pass your API key in the request headers.
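For example, the header structure might look like the following (this mirrors the hypothetical scheme used in the full Python example later in this article; consult the platform's documentation for the authoritative format):

```python
# Replace with your actual Cognitive Actions API key.
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"

# Bearer-token authentication plus a JSON content type.
headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json"
}
```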

Cognitive Actions Overview

Generate Fine-Tuned Images on Apple Vision Pro

This action generates images using an SDXL model fine-tuned on Apple's Vision Pro. It supports a variety of input parameters for controlling image attributes, so you can create images tailored to your specific needs.

Input: The input payload follows the schema outlined below:

{
  "mask": "string (uri)",
  "seed": "integer",
  "image": "string (uri)",
  "width": "integer, default: 1024",
  "height": "integer, default: 1024",
  "prompt": "string, default: 'An astronaut riding a rainbow unicorn'",
  "refine": "string, default: 'no_refiner'",
  "scheduler": "string, default: 'K_EULER'",
  "guidanceScale": "number, default: 7.5, range: [1, 50]",
  "applyWatermark": "boolean, default: true",
  "negativePrompt": "string",
  "promptStrength": "number, default: 0.8, range: [0, 1]",
  "numberOfOutputs": "integer, default: 1, range: [1, 4]",
  "refinementSteps": "integer",
  "highNoiseFraction": "number, default: 0.8, range: [0, 1]",
  "numInferenceSteps": "integer, default: 50, range: [1, 500]",
  "loraAdjustmentScale": "number, default: 0.6, range: [0, 1]"
}
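Because several fields have documented ranges, it can help to validate a payload client-side before making a request. The sketch below is purely illustrative (build_payload is a hypothetical helper, not part of the API; the service performs its own validation regardless):

```python
# Documented numeric ranges from the schema above.
RANGES = {
    "guidanceScale": (1, 50),
    "promptStrength": (0, 1),
    "numberOfOutputs": (1, 4),
    "highNoiseFraction": (0, 1),
    "numInferenceSteps": (1, 500),
    "loraAdjustmentScale": (0, 1),
}

# Documented defaults from the schema above.
DEFAULTS = {
    "width": 1024,
    "height": 1024,
    "prompt": "An astronaut riding a rainbow unicorn",
    "refine": "no_refiner",
    "scheduler": "K_EULER",
    "guidanceScale": 7.5,
    "applyWatermark": True,
    "promptStrength": 0.8,
    "numberOfOutputs": 1,
    "highNoiseFraction": 0.8,
    "numInferenceSteps": 50,
    "loraAdjustmentScale": 0.6,
}

def build_payload(**overrides):
    """Merge user overrides onto the defaults and check numeric ranges."""
    payload = {**DEFAULTS, **overrides}
    for key, (lo, hi) in RANGES.items():
        if key in payload and not lo <= payload[key] <= hi:
            raise ValueError(f"{key}={payload[key]} is outside [{lo}, {hi}]")
    return payload
```

For instance, `build_payload(applyWatermark=False)` returns the full default payload with the watermark disabled, while `build_payload(numberOfOutputs=9)` raises a ValueError before any network call is made.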

Example Input: Here’s a practical example of the JSON payload needed to invoke the action:

{
  "width": 1024,
  "height": 1024,
  "prompt": "A photo of gandalf wearing a TOK VR headset, faces visible",
  "refine": "expert_ensemble_refiner",
  "scheduler": "K_EULER",
  "guidanceScale": 7.5,
  "applyWatermark": false,
  "promptStrength": 0.8,
  "numberOfOutputs": 1,
  "highNoiseFraction": 0.95,
  "numInferenceSteps": 50,
  "loraAdjustmentScale": 0.6
}

Output: The action returns a list of URLs pointing to the generated images. Here’s an example of the output you might receive:

[
  "https://assets.cognitiveactions.com/invocations/c2acb657-0cb6-496b-8cea-b1a00aa2ca2e/07147bf6-d83b-4402-817c-41dd537955aa.png"
]
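Once you have the returned URLs, you can fetch the images and save them locally. The snippet below is a sketch using only the standard library; download_images and output_path are illustrative helper names, not part of the API:

```python
from urllib.request import urlopen

def output_path(prefix, index):
    """Local filename for the i-th generated image."""
    return f"{prefix}_{index}.png"

def download_images(urls, prefix="output"):
    """Fetch each generated image URL and save it to disk."""
    paths = []
    for i, url in enumerate(urls):
        path = output_path(prefix, i)
        with urlopen(url, timeout=60) as resp, open(path, "wb") as f:
            f.write(resp.read())
        paths.append(path)
    return paths
```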

Conceptual Usage Example (Python): Below is a conceptual Python code snippet to demonstrate how you might call this cognitive action:

import requests
import json

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute"  # Hypothetical endpoint

action_id = "3bbb8328-f51e-42a5-99f3-dedad3bd8233"  # Action ID for Generate Fine-Tuned Images on Apple Vision Pro

# Construct the input payload based on the action's requirements
payload = {
    "width": 1024,
    "height": 1024,
    "prompt": "A photo of gandalf wearing a TOK VR headset, faces visible",
    "refine": "expert_ensemble_refiner",
    "scheduler": "K_EULER",
    "guidanceScale": 7.5,
    "applyWatermark": False,
    "promptStrength": 0.8,
    "numberOfOutputs": 1,
    "highNoiseFraction": 0.95,
    "numInferenceSteps": 50,
    "loraAdjustmentScale": 0.6
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json"
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload},  # Hypothetical structure
        timeout=120  # Image generation can take a while; fail rather than hang indefinitely
    )
    response.raise_for_status()  # Raise an exception for bad status codes (4xx or 5xx)

    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))

except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError:  # json.JSONDecodeError subclasses ValueError
            print(f"Response body: {e.response.text}")

In this code:

  • The action_id specifies the action we want to invoke.
  • The payload is constructed using the required input fields.
  • The API key is passed in the headers for authentication.

Conclusion

The Generate Fine-Tuned Images on Apple Vision Pro action empowers developers to create stunning images with extensive customization options. With the ability to fine-tune various parameters, you can achieve high-quality outputs tailored to your application’s needs. Next steps may include experimenting with different prompts, refining methods, or integrating this action into a larger workflow. Happy coding!
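As one starting point for that experimentation, the sketch below builds a batch of payload variants from a list of prompts, to be sent one at a time with the request code shown earlier (the base settings reuse the example input above; the second prompt is an illustrative variation, not an official example):

```python
# Shared settings reused from the example input earlier in the article.
base_payload = {
    "width": 1024,
    "height": 1024,
    "scheduler": "K_EULER",
    "guidanceScale": 7.5,
    "numInferenceSteps": 50,
}

prompts = [
    "A photo of gandalf wearing a TOK VR headset, faces visible",
    "A watercolor painting of a cat wearing a TOK VR headset",
]

# One payload per prompt; POST each one as shown in the Python example above.
payloads = [{**base_payload, "prompt": p} for p in prompts]
```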