Generate Stunning Images with the omarprama/xot Cognitive Actions

22 Apr 2025

The omarprama/xot Cognitive Actions let developers add image generation to their applications. They expose the 'dev' and 'schnell' models along with a set of parameters for customizing the output. This blog post walks you through the Generate Image Prediction action: its inputs, its output, and a Python example of calling it.

Prerequisites

Before diving in, ensure you have the following:

  • An API key for accessing the Cognitive Actions platform.
  • Basic knowledge of JSON structure and Python programming.
  • A conceptual understanding of how HTTP requests are used to interact with an API.

For authentication, you'll typically pass your API key in the headers of your requests.
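
A minimal sketch of that header setup, assuming Bearer-token authentication as used in the Python example later in this post. The helper name build_headers and the environment variable name COGNITIVE_ACTIONS_API_KEY are illustrative choices, not part of the platform's documented API.

```python
import os


def build_headers(api_key: str) -> dict:
    """Return request headers carrying the API key as a Bearer token (assumed scheme)."""
    if not api_key:
        raise ValueError("API key is empty; set it before calling the API")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }


# Reading the key from an environment variable keeps it out of source control.
# The variable name is a hypothetical choice for this example.
api_key = os.environ.get("COGNITIVE_ACTIONS_API_KEY") or "YOUR_COGNITIVE_ACTIONS_API_KEY"
headers = build_headers(api_key)
```

Keeping the key in the environment rather than in the script makes it less likely to end up in a shared repository.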

Cognitive Actions Overview

Generate Image Prediction

The Generate Image Prediction action creates images from a text prompt, with parameters that control aspect ratio, image quality, output format, and more. You can trade speed against quality, for example through the choice of model and the number of inference steps.

Input

The input for this action requires a structured JSON object. The only required field is prompt, while several others are optional for further customization:

  • prompt (string, required): The description of the image you want to generate.
    • Example: "Front view, professional photo shoot of a tall white ginger man wearing OXT, grey shorts, and flipflops"
  • imageFormat (string, optional): The format of the output image (e.g., webp, jpg, png).
  • outputCount (integer, optional): The number of images to generate (default is 1).
  • imageQuality (integer, optional): Quality of the output images ranging from 0 to 100.
  • guidanceScale (number, optional): The guidance scale for the diffusion process, which affects how realistic the image looks.
  • mainLoraScale (number, optional): Determines the strength of the main LoRA application.
  • extraLoraScale (number, optional): Determines the strength of the extra LoRA application.
  • inferenceModel (string, optional): Choose between dev or schnell models for generation.
  • promptStrength (number, optional): The influence of the prompt on the final result.
  • imageAspectRatio (string, optional): The aspect ratio for the image (default is 1:1).
  • numInferenceSteps (integer, optional): The number of denoising steps, impacting detail and generation time.
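
The constraints in the list above can be checked on the client before a request is sent. The following is a sketch only: build_payload is a hypothetical helper, and the rule that outputCount must be at least 1 is an assumption rather than something the action's schema is known to state.

```python
ALLOWED_MODELS = {"dev", "schnell"}  # the two models named in the parameter list


def build_payload(prompt: str, **options) -> dict:
    """Assemble an input payload, checking the constraints described above."""
    if not isinstance(prompt, str) or not prompt.strip():
        raise ValueError("prompt is required and must be a non-empty string")

    quality = options.get("imageQuality")
    if quality is not None and not 0 <= quality <= 100:
        raise ValueError("imageQuality must be between 0 and 100")

    model = options.get("inferenceModel")
    if model is not None and model not in ALLOWED_MODELS:
        raise ValueError(f"inferenceModel must be one of {sorted(ALLOWED_MODELS)}")

    count = options.get("outputCount")
    if count is not None and count < 1:  # assumed lower bound
        raise ValueError("outputCount must be at least 1")

    # Optional fields left unset are omitted so the service applies its defaults.
    payload = {"prompt": prompt}
    payload.update({k: v for k, v in options.items() if v is not None})
    return payload


payload = build_payload(
    "Front view, professional photo shoot of a tall white ginger man wearing OXT, grey shorts, and flipflops",
    imageFormat="jpg",
    imageQuality=90,
    inferenceModel="dev",
)
```

Validating locally gives a clear error message without spending a network round trip on a request that is likely to be rejected.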

Here’s an example input JSON payload:

{
  "prompt": "Front view, professional photo shoot of a tall white ginger man wearing OXT, grey shorts, and flipflops",
  "imageFormat": "jpg",
  "outputCount": 1,
  "imageQuality": 90,
  "guidanceScale": 3.5,
  "mainLoraScale": 1,
  "extraLoraScale": 1,
  "inferenceModel": "dev",
  "promptStrength": 0.8,
  "imageAspectRatio": "1:1",
  "numInferenceSteps": 28
}
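
To favor speed over detail, the same payload can be adapted to the schnell model with fewer denoising steps. This sketch assumes that schnell works well with about 4 steps, which is typical for that model family but is not confirmed for this particular action. The helper with_speed_preset is hypothetical.

```python
import json

BASE_PAYLOAD = {
    "prompt": "Front view, professional photo shoot of a tall white ginger man wearing OXT, grey shorts, and flipflops",
    "imageFormat": "jpg",
    "outputCount": 1,
    "imageQuality": 90,
    "inferenceModel": "dev",
    "imageAspectRatio": "1:1",
    "numInferenceSteps": 28,
}


def with_speed_preset(payload: dict) -> dict:
    """Return a copy of the payload tuned for faster generation (illustrative values)."""
    fast = dict(payload)  # copy so the original payload is left untouched
    fast["inferenceModel"] = "schnell"
    fast["numInferenceSteps"] = 4  # assumed step count for schnell
    return fast


fast_payload = with_speed_preset(BASE_PAYLOAD)
print(json.dumps(fast_payload, indent=2))
```

Returning a copy keeps the quality-oriented payload available, so both variants can be sent and compared.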

Output

The action returns a list of URLs, one for each generated image. Here’s an example of what you might receive:

[
  "https://assets.cognitiveactions.com/invocations/80fb5d12-438b-47f2-9b67-c1eaf367b676/4a313d67-5d1c-4efe-a832-914f633278bd.jpg"
]

If the request fails, the response may instead contain an error message or status code.
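
A short sketch of handling that output, assuming the result arrives as a bare JSON list of URL strings like the example above. The real response may wrap the list in a larger object, and the function names here are hypothetical. The download helper uses only the standard library and is defined but not called.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse
from urllib.request import urlopen


def extract_image_urls(result) -> list:
    """Check that the result is a list of http(s) URL strings and return it."""
    if not isinstance(result, list):
        raise ValueError(f"expected a list of URLs, got {type(result).__name__}")
    urls = []
    for item in result:
        if not isinstance(item, str) or urlparse(item).scheme not in ("http", "https"):
            raise ValueError(f"unexpected item in result: {item!r}")
        urls.append(item)
    return urls


def filename_from_url(url: str) -> str:
    """Use the last path segment of the URL as a local file name."""
    return PurePosixPath(urlparse(url).path).name


def download_image(url: str, timeout: float = 30.0) -> str:
    """Download one image into the current directory and return its file name."""
    name = filename_from_url(url)
    with urlopen(url, timeout=timeout) as resp, open(name, "wb") as fh:
        fh.write(resp.read())
    return name


example_result = [
    "https://assets.cognitiveactions.com/invocations/80fb5d12-438b-47f2-9b67-c1eaf367b676/4a313d67-5d1c-4efe-a832-914f633278bd.jpg"
]
urls = extract_image_urls(example_result)
names = [filename_from_url(u) for u in urls]
```

Rejecting anything that is not a list of URLs means an error object returned in place of the image list surfaces as a clear exception instead of a confusing failure during download.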

Conceptual Usage Example (Python)

Here’s how you might implement a call to the Generate Image Prediction action in Python:

import requests
import json

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute"  # Hypothetical endpoint

action_id = "621387ef-e9a0-4365-b79b-a69daaf1c070"  # Action ID for Generate Image Prediction

# Construct the input payload based on the action's requirements
payload = {
    "prompt": "Front view, professional photo shoot of a tall white ginger man wearing OXT, grey shorts, and flipflops",
    "imageFormat": "jpg",
    "outputCount": 1,
    "imageQuality": 90,
    "guidanceScale": 3.5,
    "mainLoraScale": 1,
    "extraLoraScale": 1,
    "inferenceModel": "dev",
    "promptStrength": 0.8,
    "imageAspectRatio": "1:1",
    "numInferenceSteps": 28
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json"
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload},  # Hypothetical structure
        timeout=120,  # Avoid hanging indefinitely; image generation can take a while
    )
    response.raise_for_status()  # Raise an exception for bad status codes (4xx or 5xx)

    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))

except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError:  # Body is not valid JSON; covers every JSON decode error type
            print(f"Response body: {e.response.text}")

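In this code snippet, replace YOUR_COGNITIVE_ACTIONS_API_KEY with your actual API key. The input payload follows the parameters described above. The endpoint URL and the request body structure are hypothetical, as the comments note, so check them against the platform's documentation before use.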

Conclusion

The omarprama/xot Cognitive Actions give developers a straightforward way to generate images tailored to their specifications. With the Generate Image Prediction action, you can create visuals for your applications or projects, experiment with the parameters to find the settings that suit your needs, and integrate image generation into your existing workflows. Happy coding!