Enhance Your Applications with Image Generation Using Yoshi Cognitive Actions

21 Apr 2025

Generating high-quality images programmatically can improve user experiences and streamline content workflows. The Yoshi Cognitive Actions (from the spec jimmywong974/yoshi) let developers create images, including through inpainting. Because these actions are pre-built, you can focus on your application while integrating image generation into it.

Prerequisites

Before diving into the integration, ensure you have the following:

  • An API key for the Cognitive Actions platform.
  • Basic knowledge of JSON and HTTP requests.
  • Familiarity with Python for testing your integration.

To authenticate your requests, include your API key in the request headers. The usage example later in this post sends it as a Bearer token in the Authorization header.
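As a minimal sketch of the header setup, the helper below reads the key from an environment variable instead of hard-coding it. Both the variable name COGNITIVE_ACTIONS_API_KEY and the helper build_headers are assumptions made for illustration, and the Bearer scheme simply mirrors the usage example in this post.

```python
import os


def build_headers(api_key=None):
    """Return request headers carrying the API key as a Bearer token.

    Assumption: the platform accepts "Authorization: Bearer <key>",
    as in the conceptual usage example in this post.
    """
    # Fall back to an environment variable (hypothetical name) so the key
    # does not have to live in source control.
    key = api_key or os.environ.get("COGNITIVE_ACTIONS_API_KEY")
    if not key:
        raise ValueError("No API key provided")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Calling build_headers() with no argument raises a ValueError when the variable is unset, which surfaces a missing key before any request is sent.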

Cognitive Actions Overview

Generate Enhanced Image with Inpainting

The Generate Enhanced Image with Inpainting action creates images using either the 'schnell' or 'dev' model. An optional goFast flag enables faster predictions, and you can tune parameters such as aspect ratio, output quality, and image resolution.

Input

The input for this action requires a JSON object with the following fields:

  • prompt (required): Text description of the desired image.
  • mask (optional): URI of the image mask for inpainting.
  • seed (optional): Integer seed for reproducibility.
  • image (optional): URI of an input image for transformation.
  • width (optional): Width in pixels (when aspectRatio is 'custom').
  • height (optional): Height in pixels (when aspectRatio is 'custom').
  • goFast (optional): Boolean to enable faster predictions.
  • aspectRatio (optional): Desired aspect ratio of the image.
  • imageFormat (optional): Format of the output image (webp, jpg, png).
  • numOutputs (optional): Number of images to generate.
  • guidanceScale (optional): Scale for the diffusion process.
  • outputQuality (optional): Quality of the generated image.
  • inferenceModel (optional): Model to use for inference ('schnell' or 'dev').
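The field list above can be turned into a small payload builder. This is a sketch under stated assumptions: build_payload is a hypothetical helper, the checks run client-side only, and the API itself may ignore rather than reject width and height when aspectRatio is not 'custom'. The example.com URIs are placeholders.

```python
ALLOWED_FORMATS = {"webp", "jpg", "png"}  # output formats listed above


def build_payload(prompt, **options):
    """Assemble the input object, omitting optional fields left as None."""
    if not isinstance(prompt, str) or not prompt.strip():
        raise ValueError("prompt is required")

    # Keep only options that were explicitly set.
    fields = {k: v for k, v in options.items() if v is not None}

    fmt = fields.get("imageFormat")
    if fmt is not None and fmt not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported imageFormat: {fmt!r}")

    # Client-side choice: treat width/height without a custom ratio as a mistake.
    has_size = "width" in fields or "height" in fields
    if has_size and fields.get("aspectRatio") != "custom":
        raise ValueError("width/height require aspectRatio='custom'")

    return {"prompt": prompt, **fields}


# Inpainting-style input: an image plus a mask (placeholder URIs).
payload = build_payload(
    "a red fox in fresh snow",
    image="https://example.com/input.png",
    mask="https://example.com/mask.png",
    seed=42,
    imageFormat="png",
)
```

Dropping None values keeps the request body limited to the fields you actually set, so any server-side defaults stay in effect for the rest.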

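Example Input (note that it also sets extraLora, loraScale, extraLoraScale, and numInferenceSteps, which are not described in the list above):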

{
  "prompt": "MCATYOSHI and MCATKIRBY leaping through a colorful array of fallen leaves in a sun-dappled park, dynamic action shot, shallow depth of field, rich saturated hues.",
  "extraLora": "jimmywong974/kirby",
  "loraScale": 1,
  "numOutputs": 1,
  "aspectRatio": "1:1",
  "imageFormat": "png",
  "guidanceScale": 3.5,
  "outputQuality": 80,
  "extraLoraScale": 1,
  "inferenceModel": "schnell",
  "numInferenceSteps": 28
}

Output

The action returns a JSON array containing the URIs of the generated images, one entry per output.

Example Output:

[
  "https://assets.cognitiveactions.com/invocations/c45f78cc-cea2-4114-8b7e-f5ab585d33e5/a7b3fe2c-d8e2-4540-b706-9d5b54df00a8.png"
]
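Since the output is a plain list of URIs, saving the images locally takes only a few lines. The sketch below uses the Python standard library and assumes the returned URIs can be fetched without extra authentication, which the source does not confirm. The helper names filename_from_uri and download_images are hypothetical.

```python
import os
import posixpath
import shutil
from urllib.parse import urlparse
from urllib.request import urlopen


def filename_from_uri(uri):
    """Return the last path segment of a URI to use as a local file name."""
    return posixpath.basename(urlparse(uri).path)


def download_images(uris, dest_dir="."):
    """Download each URI into dest_dir and return the saved file paths."""
    os.makedirs(dest_dir, exist_ok=True)
    saved = []
    for uri in uris:
        path = os.path.join(dest_dir, filename_from_uri(uri))
        # urlopen raises urllib.error.HTTPError on 4xx/5xx responses.
        with urlopen(uri, timeout=60) as resp, open(path, "wb") as fh:
            shutil.copyfileobj(resp, fh)
        saved.append(path)
    return saved
```

With the example output above, download_images(result, "outputs") would write a single PNG named after the last path segment of the URI.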

Conceptual Usage Example (Python)

Here’s a conceptual Python code snippet illustrating how to call this action:

import requests
import json

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute"  # Hypothetical endpoint

action_id = "db990b72-56b5-45f9-a790-d6d8168bc7fa"  # Action ID for Generate Enhanced Image with Inpainting

# Construct the input payload based on the action's requirements
payload = {
    "prompt": "MCATYOSHI and MCATKIRBY leaping through a colorful array of fallen leaves in a sun-dappled park, dynamic action shot, shallow depth of field, rich saturated hues.",
    "extraLora": "jimmywong974/kirby",
    "loraScale": 1,
    "numOutputs": 1,
    "aspectRatio": "1:1",
    "imageFormat": "png",
    "guidanceScale": 3.5,
    "outputQuality": 80,
    "extraLoraScale": 1,
    "inferenceModel": "schnell",
    "numInferenceSteps": 28
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json"
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload},  # Hypothetical structure
        timeout=60,  # Avoid hanging indefinitely if the server does not respond
    )
    response.raise_for_status()  # Raise an exception for bad status codes (4xx or 5xx)

    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))

except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError:  # Body was not valid JSON (covers json and simplejson decode errors)
            print(f"Response body: {e.response.text}")

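In this example, replace the placeholder API key with your own and the hypothetical endpoint URL with the real one for your account. The payload matches the example input above, and the action ID identifies the Generate Enhanced Image with Inpainting action. The request body structure (action_id plus inputs) is illustrative, as marked in the code.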

Conclusion

Integrating the Yoshi Cognitive Actions into your applications can significantly enhance your capabilities in image generation. By mastering the Generate Enhanced Image with Inpainting action, you can create visually stunning content tailored to your needs. Explore the possibilities, and consider other use cases where these actions can add value to your projects. Happy coding!