Effortlessly Generate Custom Images with the yesuel1/sante-ostris Cognitive Actions

23 Apr 2025

The yesuel1/sante-ostris spec gives developers a set of pre-built tools, known as Cognitive Actions, for generating and manipulating images. Its Generate Image with Inpainting action creates high-quality images from a text prompt and can also modify existing images, using models tuned for speed or precision. Because the actions are pre-built, they simplify integrating image generation into your applications, replacing work that was once time-consuming or complex.

Prerequisites

Before diving into the Cognitive Actions, ensure you have the following:

  • An API key for accessing the Cognitive Actions platform.
  • Basic familiarity with making HTTP requests and handling JSON data.

Authentication generally involves passing your API key in the headers of each request.
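
As a minimal sketch of that step, the helper below builds the request headers from an API key. It assumes the Bearer scheme used in the usage example later in this article; the function name build_headers and the environment variable COGNITIVE_ACTIONS_API_KEY are hypothetical choices for illustration, not part of the platform.

```python
import os
from typing import Dict, Optional


def build_headers(api_key: Optional[str] = None) -> Dict[str, str]:
    """Return HTTP headers that carry the API key as a Bearer token.

    Assumption: the platform accepts "Authorization: Bearer <key>".
    The environment variable name below is a hypothetical convention.
    """
    key = api_key or os.environ.get("COGNITIVE_ACTIONS_API_KEY")
    if not key:
        raise ValueError("No API key provided; pass one or set COGNITIVE_ACTIONS_API_KEY")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Reading the key from the environment keeps it out of source control; passing it explicitly is convenient in tests.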

Cognitive Actions Overview

Generate Image with Inpainting

The Generate Image with Inpainting action generates images from a descriptive prompt. When given an existing image, optionally with a mask, it modifies that image through image-to-image processing or inpainting. It supports several configuration options, including image resolution, aspect ratio, and output format.
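
To make the inpainting case concrete, here is a rough sketch of how such a payload might be assembled. It uses the prompt, image, mask, and promptStrength fields from the input list in this article; the helper name build_inpainting_payload and the example.com URLs are hypothetical, and requiring an image whenever a mask is supplied is an assumption rather than documented behavior.

```python
from typing import Any, Dict, Optional


def build_inpainting_payload(
    prompt: str,
    image_uri: str,
    mask_uri: Optional[str] = None,
    prompt_strength: float = 0.8,
) -> Dict[str, Any]:
    """Assemble an input payload for image-to-image or inpainting use.

    Assumption: "image" is the source picture and "mask" marks the
    region to repaint; without a mask this is plain image-to-image.
    """
    if not prompt.strip():
        raise ValueError("prompt must be a non-empty string")
    payload: Dict[str, Any] = {
        "prompt": prompt,
        "image": image_uri,
        "promptStrength": prompt_strength,
    }
    if mask_uri is not None:
        payload["mask"] = mask_uri
    return payload


# Hypothetical URLs, for illustration only
example = build_inpainting_payload(
    "a man in a suit holding a blank A4 sheet",
    "https://example.com/portrait.png",
    mask_uri="https://example.com/portrait-mask.png",
)
```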

Input:

The input for this action requires a JSON object that can include the following fields:

  • prompt (required): A descriptive string guiding the image generation.
  • mask (optional): URI of the image mask for inpainting mode.
  • seed (optional): Integer for random seed.
  • image (optional): URI of an existing image for image-to-image processing.
  • model (optional): Inference model to use; defaults to dev.
  • width (optional): Custom width in pixels (256-1440).
  • height (optional): Custom height in pixels (256-1440).
  • fastMode (optional): Boolean to enable fast inference mode.
  • loraScale (optional): Scale factor for applying the main LoRA.
  • megapixels (optional): Approximate megapixels for the image.
  • aspectRatio (optional): Aspect ratio for the generated image.
  • imageFormat (optional): Format for the output image (webp, jpg, png).
  • outputCount (optional): Number of images to generate (1-4).
  • imageQuality (optional): Quality of output images (0-100).
  • guidanceScale (optional): Scale for the diffusion process (0-10).
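  • additionalLora (optional): URI for additional LoRA weights.
  • additionalLoraScale (optional): Scale factor for the additional LoRA, as used in the example input.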
  • promptStrength (optional): Strength of the prompt effect (0-1).
  • inferenceStepCount (optional): Number of denoising steps (1-50).
  • safetyCheckerDisabled (optional): Toggle for safety checks.
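
The ranges listed above can be checked on the client before a request is sent. The following is a minimal sketch, assuming the documented ranges are inclusive; validate_payload and the constant names are hypothetical helpers, not part of the platform.

```python
from typing import Any, Dict, List

# (min, max) ranges copied from the field list above; treated as inclusive (assumption)
NUMERIC_RANGES = {
    "width": (256, 1440),
    "height": (256, 1440),
    "outputCount": (1, 4),
    "imageQuality": (0, 100),
    "guidanceScale": (0, 10),
    "promptStrength": (0, 1),
    "inferenceStepCount": (1, 50),
}
IMAGE_FORMATS = {"webp", "jpg", "png"}


def validate_payload(payload: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems: List[str] = []
    prompt = payload.get("prompt")
    if not isinstance(prompt, str) or not prompt.strip():
        problems.append("prompt is required and must be a non-empty string")
    for name, (low, high) in NUMERIC_RANGES.items():
        if name not in payload:
            continue  # every field except prompt is optional
        value = payload[name]
        # bool is a subclass of int in Python, so reject it explicitly
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            problems.append(f"{name} must be a number")
        elif not low <= value <= high:
            problems.append(f"{name} must be between {low} and {high}, got {value}")
    fmt = payload.get("imageFormat")
    if fmt is not None and fmt not in IMAGE_FORMATS:
        problems.append(f"imageFormat must be one of {sorted(IMAGE_FORMATS)}, got {fmt!r}")
    return problems
```

Returning a list of problems rather than raising on the first one lets a caller report every issue in a single pass.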

Example Input:

{
  "model": "dev",
  "prompt": "sante-ostris, the professional-looking man wearing a suit and glasses, holding a blank A4 paper horizontally with both hands. sante-kim has a straight posture and confident expression, with the A4 paper clearly being held in a landscape orientation. The background remains simple to keep the focus on the man and the paper, with the A4 sheet showing an empty space for text or drawing.",
  "loraScale": 1,
  "aspectRatio": "1:1",
  "imageFormat": "png",
  "outputCount": 2,
  "imageQuality": 90,
  "guidanceScale": 3.5,
  "promptStrength": 0.8,
  "inferenceStepCount": 28,
  "additionalLoraScale": 1
}

Output:

The action typically returns an array of URLs pointing to the generated images. Here's a sample output:

[
  "https://assets.cognitiveactions.com/invocations/086e8ff4-9640-45ed-ba25-c0b02021158d/c7ba594d-5b73-4ecc-b7c6-fef26bcf430a.png",
  "https://assets.cognitiveactions.com/invocations/086e8ff4-9640-45ed-ba25-c0b02021158d/834f9a45-f556-49b0-8d9d-b8303d9720a6.png"
]
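
Since the output is a list of URLs, a common next step is to save the images locally. The sketch below uses only the Python standard library and assumes the URLs can be fetched with a plain GET and no extra authentication, which the source does not confirm; filename_from_url and download_images are hypothetical helper names.

```python
import os
import posixpath
from typing import List
from urllib.parse import urlparse
from urllib.request import urlopen


def filename_from_url(url: str) -> str:
    """Use the last path segment of the URL as the local file name."""
    name = posixpath.basename(urlparse(url).path)
    if not name:
        raise ValueError(f"Cannot derive a file name from {url!r}")
    return name


def download_images(urls: List[str], dest_dir: str = "outputs") -> List[str]:
    """Download each URL into dest_dir and return the saved paths.

    Assumption: the asset URLs are directly downloadable without headers.
    """
    os.makedirs(dest_dir, exist_ok=True)
    saved = []
    for url in urls:
        path = os.path.join(dest_dir, filename_from_url(url))
        with urlopen(url, timeout=60) as resp, open(path, "wb") as fh:
            fh.write(resp.read())
        saved.append(path)
    return saved
```

Calling download_images(result) on the array returned by the action would write one file per generated image, named after the final segment of each URL.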

Conceptual Usage Example (Python):

Here’s a conceptual Python code snippet illustrating how to invoke this action:

import requests
import json

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute"  # Hypothetical endpoint

action_id = "e2768cea-89a4-4398-8b33-584e968b74b7"  # Action ID for Generate Image with Inpainting

# Construct the input payload based on the action's requirements
payload = {
    "model": "dev",
    "prompt": "sante-ostris, the professional-looking man wearing a suit and glasses, holding a blank A4 paper horizontally with both hands. sante-kim has a straight posture and confident expression, with the A4 paper clearly being held in a landscape orientation. The background remains simple to keep the focus on the man and the paper, with the A4 sheet showing an empty space for text or drawing.",
    "loraScale": 1,
    "aspectRatio": "1:1",
    "imageFormat": "png",
    "outputCount": 2,
    "imageQuality": 90,
    "guidanceScale": 3.5,
    "promptStrength": 0.8,
    "inferenceStepCount": 28,
    "additionalLoraScale": 1
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json"
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload},  # Hypothetical structure
        timeout=60,  # Avoid hanging indefinitely if the server does not respond
    )
    response.raise_for_status()  # Raise an exception for bad status codes (4xx or 5xx)

    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))

except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError:  # JSON decode errors subclass ValueError
            print(f"Response body: {e.response.text}")

In this example, replace YOUR_COGNITIVE_ACTIONS_API_KEY with your actual API key. The endpoint URL and the request body structure are hypothetical, so adjust them to match the platform's documentation. The action ID identifies the Generate Image with Inpainting action, and the input payload follows the fields described above.

Conclusion

The yesuel1/sante-ostris Cognitive Actions, and the Generate Image with Inpainting action in particular, give developers a pre-built way to generate and modify images. With them you can add image generation and inpainting to your applications without building the pipeline yourself. From here, consider exploring more use cases or combining this action with other functionality in your application.