Harness Image Generation Power with leosy-kingdom's Cognitive Actions

22 Apr 2025

Generating and manipulating images programmatically is useful in a wide range of applications. The leosy-kingdom/leosy-earth2 spec exposes a set of Cognitive Actions focused on image generation. Its main action, Generate Image with Inpainting, supports image inpainting, custom aspect ratios, and style adjustments. By calling this pre-built action, developers can add image generation features to their applications without building them from scratch.

Prerequisites

Before diving into the implementation, ensure that you have the following:

  • An API key for accessing the Cognitive Actions platform.
  • Basic familiarity with RESTful APIs and JSON data structures.
  • A development environment set up to make HTTP requests (e.g., Python with the requests library).

Authentication typically involves passing your API key in the headers of your requests, which ensures secure access to the Cognitive Actions.
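As a minimal sketch of the header-based authentication described above, the helper below builds the request headers and reads the key from an environment variable when none is passed in. The Bearer scheme matches the usage example later in this post; the function name and the environment variable name are assumptions made for illustration, not part of the platform.

```python
import os


def build_auth_headers(api_key=None, env_var="COGNITIVE_ACTIONS_API_KEY"):
    """Return request headers carrying the API key.

    The env_var name is a hypothetical convention, not a platform requirement.
    """
    key = api_key or os.environ.get(env_var)
    if not key:
        raise RuntimeError(f"No API key given and {env_var} is not set")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Reading the key from the environment keeps it out of source control, which is usually preferable to hard-coding it in a script.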

Cognitive Actions Overview

Generate Image with Inpainting

The Generate Image with Inpainting action creates images from a text prompt. It also supports advanced features such as image inpainting and custom styling, where the strength of the LoRA (Low-Rank Adaptation) weights is adjustable. This action falls under the image-generation category.

Input

To invoke this action, you need to provide a structured input payload. Below are the required and optional fields based on the input_schema:

  • Required:
    • prompt (string): Text description guiding image creation.
  • Optional:
    • mask (string): URI for an image mask used in inpainting.
    • seed (integer): Random seed for consistent image generation.
    • image (string): URI of an input image for transformations.
    • width (integer): Width of the generated image (only for custom aspect ratios).
    • height (integer): Height of the generated image (only for custom aspect ratios).
    • goFast (boolean): Enables faster predictions using an optimized model.
    • extraLora (string): Location of additional LoRA weights.
    • loraScale (number): Scaling factor for the main LoRA.
    • guidanceScale (number): Adjusts the guidance during image generation.
    • imageAspectRatio (string): Sets the aspect ratio of the output image.
    • imageOutputFormat (string): Format for the output image (e.g., webp, jpg, png).
    • numberOfOutputs (integer): How many output images to generate.
    • numberOfInferenceSteps (integer): Steps for denoising the image.
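The field list above can be turned into a small client-side check that runs before a request is sent. The sketch below is illustrative only: the names FIELD_TYPES and validate_payload are hypothetical, and the type mapping simply mirrors the list in this post rather than the platform's actual schema. Unknown fields are reported rather than rejected, because the example input in this post uses a few fields that the list does not describe.

```python
NUMBER = (int, float)

# Field names and types copied from the list above; "prompt" is the only required one.
FIELD_TYPES = {
    "prompt": str,
    "mask": str,
    "seed": int,
    "image": str,
    "width": int,
    "height": int,
    "goFast": bool,
    "extraLora": str,
    "loraScale": NUMBER,
    "guidanceScale": NUMBER,
    "imageAspectRatio": str,
    "imageOutputFormat": str,
    "numberOfOutputs": int,
    "numberOfInferenceSteps": int,
}


def _label(expected):
    """Human-readable name for an expected type."""
    return "number" if expected is NUMBER else expected.__name__


def validate_payload(payload):
    """Return a list of problems; an empty list means the payload matches the field list."""
    problems = []
    if "prompt" not in payload:
        problems.append("missing required field: prompt")
    for name, value in payload.items():
        if name not in FIELD_TYPES:
            problems.append(f"unknown field: {name}")
            continue
        expected = FIELD_TYPES[name]
        # bool is a subclass of int in Python, so reject it explicitly for non-boolean fields.
        wrong_bool = isinstance(value, bool) and expected is not bool
        if wrong_bool or not isinstance(value, expected):
            problems.append(
                f"{name} should be {_label(expected)}, got {type(value).__name__}"
            )
    return problems
```

Treat "unknown field" messages as warnings to investigate rather than hard errors, since the real schema may accept more fields than this list shows.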

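Example Input (note that it also sets extraLoraScale, inferenceModel, promptStrength, and imageOutputQuality, which the field list above does not describe):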

{
  "prompt": "Handheld camera, handsome man, romantic thick hair, thick hair, have the spirit of a CEO, wearing a office shirt, roll up long sleeve shirt, holding a flower",
  "loraScale": 1,
  "guidanceScale": 3.5,
  "extraLoraScale": 1,
  "inferenceModel": "dev",
  "promptStrength": 1,
  "numberOfOutputs": 2,
  "imageAspectRatio": "16:9",
  "imageOutputFormat": "png",
  "imageOutputQuality": 100,
  "numberOfInferenceSteps": 28
}
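The example input above generates images from a prompt alone. For the inpainting case, a payload would presumably also carry the image and mask fields from the field list. The sketch below shows one way to assemble such a payload. The helper name, the requirement that both URIs be supplied together, and the scheme check are assumptions for illustration; the post does not specify how the platform validates these fields or how the mask is interpreted.

```python
from urllib.parse import urlparse


def make_inpainting_payload(prompt, image_uri, mask_uri, seed=None):
    """Build an input payload that uses the image and mask fields.

    Hypothetical helper: requiring both URIs together is an assumption,
    not documented platform behavior.
    """
    for label, uri in (("image", image_uri), ("mask", mask_uri)):
        if not urlparse(uri).scheme:
            raise ValueError(f"{label} must be a URI with a scheme, got {uri!r}")
    payload = {"prompt": prompt, "image": image_uri, "mask": mask_uri}
    if seed is not None:
        # A fixed seed is described above as giving consistent generations.
        payload["seed"] = seed
    return payload


# Placeholder URIs; replace them with locations the platform can actually reach.
inpainting_payload = make_inpainting_payload(
    "a red scarf around the neck",
    "https://example.com/photo.png",
    "https://example.com/mask.png",
    seed=42,
)
```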

Output

The action typically returns a list of URLs pointing to the generated images. Here’s an example of the expected output:

Example Output:

[
  "https://assets.cognitiveactions.com/invocations/7bd2f7d6-a95c-4a3e-b93c-ddee2b52d78a/9d218d5b-bf88-4920-91b6-25b749bacc1b.png",
  "https://assets.cognitiveactions.com/invocations/7bd2f7d6-a95c-4a3e-b93c-ddee2b52d78a/e4871e7d-a2d9-4834-a07e-bee87003e084.png"
]
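Since the output is a plain list of URLs, a natural next step is saving the images locally. The following sketch uses only the Python standard library. The function names and the "outputs" directory are illustrative choices, and it assumes the returned URLs can be fetched with a plain GET request without extra authentication, which this post does not confirm.

```python
from pathlib import Path, PurePosixPath
from urllib.parse import urlparse
from urllib.request import urlopen


def filename_from_url(url):
    """Return the last path segment of a URL, e.g. '<id>.png'."""
    name = PurePosixPath(urlparse(url).path).name
    if not name:
        raise ValueError(f"URL has no file name component: {url!r}")
    return name


def download_images(urls, target_dir="outputs", timeout=30):
    """Download each URL into target_dir and return the saved paths."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    saved = []
    for url in urls:
        destination = target / filename_from_url(url)
        with urlopen(url, timeout=timeout) as response:
            destination.write_bytes(response.read())
        saved.append(destination)
    return saved
```

Calling download_images(result) on the list shown above would write one file per URL, each named after the final segment of its URL.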

Conceptual Usage Example (Python)

Below is a conceptual Python code snippet that demonstrates how a developer might call the hypothetical Cognitive Actions execution endpoint:

import requests
import json

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute" # Hypothetical endpoint

action_id = "b40e95ed-2694-42b0-a482-c8fd084c3027" # Action ID for Generate Image with Inpainting

# Construct the input payload based on the action's requirements
payload = {
    "prompt": "Handheld camera, handsome man, romantic thick hair, thick hair, have the spirit of a CEO, wearing a office shirt, roll up long sleeve shirt, holding a flower",
    "loraScale": 1,
    "guidanceScale": 3.5,
    "extraLoraScale": 1,
    "inferenceModel": "dev",
    "promptStrength": 1,
    "numberOfOutputs": 2,
    "imageAspectRatio": "16:9",
    "imageOutputFormat": "png",
    "imageOutputQuality": 100,
    "numberOfInferenceSteps": 28
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json"
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload},  # Hypothetical structure
        timeout=120,  # Avoid hanging forever; the value is an assumption, tune it for your workload
    )
    response.raise_for_status() # Raise an exception for bad status codes (4xx or 5xx)

    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))

except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError:  # Covers every JSON decode error variant that requests can raise
            print(f"Response body: {e.response.text}")

In this snippet, replace YOUR_COGNITIVE_ACTIONS_API_KEY with your actual API key. The action_id variable holds the ID for the Generate Image with Inpainting action, and the input payload follows the structure described above. The code sends a POST request to the hypothetical execution endpoint, prints the JSON result on success, and prints the status code and response body if the request fails.

Conclusion

The Generate Image with Inpainting action from the leosy-kingdom/leosy-earth2 spec lets developers integrate advanced image generation into their applications. With the input structure and output format described here, you can generate images tailored to your needs. Try the action in your own projects and see what you can create. Happy coding!