Generate Stunning Images with the hilongjw/sdxl-page Cognitive Actions

24 Apr 2025

The hilongjw/sdxl-page spec gives developers a way to integrate image generation into their applications. It provides Cognitive Actions, pre-built operations that generate and refine images from prompts, masks, and a range of customization options. Using these actions saves the time and resources of building that functionality yourself.

Prerequisites

Before integrating the Cognitive Actions in the hilongjw/sdxl-page spec, make sure you have the following:

  • An API key for accessing the Cognitive Actions platform.
  • Basic knowledge of making HTTP requests and handling JSON data.
  • A working environment for executing Python code, including the requests library for making API calls.

To authenticate, you typically pass the API key in an HTTP request header. The Python example later in this article sends it as a bearer token in the Authorization header.
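
A minimal sketch of building such headers is shown here. It assumes a bearer-token scheme (the same one the usage example later in this article uses) and reads the key from an environment variable whose name, COGNITIVE_ACTIONS_API_KEY, is a hypothetical choice for illustration:

```python
import os


def build_headers(api_key=None):
    """Return HTTP headers that carry the API key as a bearer token.

    Assumptions: the platform accepts "Authorization: Bearer <key>", and
    the environment variable name below is hypothetical.
    """
    key = api_key or os.environ.get("COGNITIVE_ACTIONS_API_KEY")
    if not key:
        raise ValueError("No API key provided")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Reading the key from the environment keeps it out of source control; passing it explicitly is still possible for quick experiments.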

Cognitive Actions Overview

Generate And Refine Image

The Generate And Refine Image action is designed to create images using a combination of an input image, a mask, and descriptive text prompts. This action also provides options for refining the generated images through various techniques, including LoRA scaling and different scheduler strategies. The customization capabilities allow developers to specify output dimensions, the number of images to generate, and more, ensuring high-quality results tailored to their needs.

Input

The action requires a structured input defined by the following schema:

{
  "mask": "string (uri)",
  "seed": "integer",
  "image": "string (uri)",
  "width": "integer (default: 1024)",
  "height": "integer (default: 1024)",
  "prompt": "string (default: 'An astronaut riding a rainbow unicorn')",
  "outputCount": "integer (default: 1, min: 1, max: 4)",
  "modelWeights": "string",
  "scheduleType": "string (enum: ['DDIM', 'DPMSolverMultistep', 'HeunDiscrete', 'KarrasDPM', 'K_EULER_ANCESTRAL', 'K_EULER', 'PNDM'], default: 'K_EULER')",
  "useWatermark": "boolean (default: true)",
  "inferenceSteps": "integer (default: 50, min: 1, max: 500)",
  "loraAdjustment": "number (default: 0.6, min: 0, max: 1)",
  "refinementSteps": "integer",
  "refinementStyle": "string (enum: ['no_refiner', 'expert_ensemble_refiner', 'base_image_refiner'], default: 'no_refiner')",
  "guidanceIntensity": "number (default: 7.5, min: 1, max: 50)",
  "highNoiseFraction": "number (default: 0.8, min: 0, max: 1)",
  "safetyCheckToggle": "boolean (default: false)",
  "negativeInputPrompt": "string",
  "inputPromptIntensity": "number (default: 0.8, min: 0, max: 1)"
}
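
Because the schema states explicit ranges and enum values, a payload can be checked locally before it is sent. The following is a minimal sketch of such a check; the function name validate_inputs is hypothetical, and only the limits are taken from the schema above:

```python
# Numeric limits copied from the schema above: field -> (min, max).
NUMERIC_RANGES = {
    "outputCount": (1, 4),
    "inferenceSteps": (1, 500),
    "loraAdjustment": (0, 1),
    "guidanceIntensity": (1, 50),
    "highNoiseFraction": (0, 1),
    "inputPromptIntensity": (0, 1),
}

# Allowed values for the two enum fields in the schema.
ENUMS = {
    "scheduleType": {
        "DDIM", "DPMSolverMultistep", "HeunDiscrete", "KarrasDPM",
        "K_EULER_ANCESTRAL", "K_EULER", "PNDM",
    },
    "refinementStyle": {
        "no_refiner", "expert_ensemble_refiner", "base_image_refiner",
    },
}


def validate_inputs(payload):
    """Return a list of problems found in the payload (empty if none)."""
    problems = []
    for field, (low, high) in NUMERIC_RANGES.items():
        if field in payload and not (low <= payload[field] <= high):
            problems.append(
                f"{field}={payload[field]} is outside [{low}, {high}]"
            )
    for field, allowed in ENUMS.items():
        if field in payload and payload[field] not in allowed:
            problems.append(
                f"{field}={payload[field]!r} is not one of {sorted(allowed)}"
            )
    return problems
```

Fields that are absent from the payload are skipped, on the assumption that the service applies the documented defaults.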

Example Input:

Here’s an example of how the input JSON might look when invoking this action:

{
  "width": 512,
  "height": 2048,
  "prompt": "landing page screenshot of Nike",
  "outputCount": 1,
  "scheduleType": "K_EULER",
  "useWatermark": true,
  "inferenceSteps": 50,
  "loraAdjustment": 0.6,
  "refinementStyle": "no_refiner",
  "guidanceIntensity": 7.5,
  "highNoiseFraction": 0.8,
  "negativeInputPrompt": "",
  "inputPromptIntensity": 0.8
}
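
The example above is text-only. The schema also accepts image and mask URIs, so a payload that supplies them might be assembled as in the sketch below. The helper name build_inpaint_payload and both URLs are placeholders for illustration, and the article does not describe how the service interprets the mask, so treat this as a starting point only:

```python
def build_inpaint_payload(prompt, image_url, mask_url, **overrides):
    """Combine a prompt with image and mask URIs into an input payload.

    Extra keyword arguments (for example outputCount) are merged in
    unchanged; fields left out fall back to the service defaults.
    """
    payload = {
        "prompt": prompt,
        "image": image_url,
        "mask": mask_url,
    }
    payload.update(overrides)
    return payload


example = build_inpaint_payload(
    "landing page screenshot of Nike",
    "https://example.com/source.png",  # placeholder URL
    "https://example.com/mask.png",    # placeholder URL
    outputCount=2,
)
```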

Output

Upon successful execution, the action typically returns a JSON array of URLs pointing to the generated images (presumably one entry per requested output). Here’s an example of the expected output:

[
  "https://assets.cognitiveactions.com/invocations/3bbf4548-66bd-48c7-ae2f-aace2b0bde63/970dbf08-867f-495b-bb92-0229e05bd76c.png"
]
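
To keep the results, the returned URLs can be downloaded to disk. The sketch below uses only the Python standard library and assumes that the response body is the bare array shown above and that the URLs can be fetched without extra authentication; neither point is confirmed here, and the function names are hypothetical:

```python
import os
import posixpath
from urllib.parse import urlparse
from urllib.request import urlopen


def filename_from_url(url):
    """Use the last path segment of the URL as the local file name."""
    return posixpath.basename(urlparse(url).path)


def download_images(urls, directory="."):
    """Download every URL in the list and return the saved file paths."""
    os.makedirs(directory, exist_ok=True)
    saved = []
    for url in urls:
        path = os.path.join(directory, filename_from_url(url))
        # Stream the response body into a local file.
        with urlopen(url, timeout=60) as resp, open(path, "wb") as out:
            out.write(resp.read())
        saved.append(path)
    return saved
```

Calling download_images(result, "outputs") on the parsed response would then save each image under its original file name.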

Conceptual Usage Example (Python)

Here’s a conceptual example of how you might call the Generate And Refine Image action using Python:

import requests
import json

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute" # Hypothetical endpoint

action_id = "4e10b25e-1a9f-47d5-9b09-94baa405965a" # Action ID for Generate And Refine Image

# Construct the input payload based on the action's requirements
payload = {
    "width": 512,
    "height": 2048,
    "prompt": "landing page screenshot of Nike",
    "outputCount": 1,
    "scheduleType": "K_EULER",
    "useWatermark": True,
    "inferenceSteps": 50,
    "loraAdjustment": 0.6,
    "refinementStyle": "no_refiner",
    "guidanceIntensity": 7.5,
    "highNoiseFraction": 0.8,
    "negativeInputPrompt": "",
    "inputPromptIntensity": 0.8
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json"
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload}, # Hypothetical structure
        timeout=120 # Avoid hanging indefinitely; image generation can be slow
    )
    response.raise_for_status() # Raise an exception for bad status codes (4xx or 5xx)

    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))

except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError: # Covers JSON decode errors across requests versions
            print(f"Response body: {e.response.text}")

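To run this snippet, replace the placeholder API key and the hypothetical endpoint URL with your actual values. The payload follows the input schema above, and the request body wraps it together with the action ID. As the code comments note, that wrapper structure is itself hypothetical, so check it against the platform's documentation before relying on it.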

Conclusion

The hilongjw/sdxl-page Cognitive Actions let developers integrate image generation and refinement into their applications. With the Generate And Refine Image action, you can produce images tailored to your specifications by adjusting output dimensions, the scheduler, the refiner, and the guidance settings. As you explore the action further, experiment with these options to see how each one affects the results. Happy coding!