Elevate Image Generation with the yasseryera/yasser Cognitive Actions

24 Apr 2025

Generating and editing images programmatically is useful across many content creation workflows. The yasseryera/yasser Cognitive Actions expose an API for image generation, with inpainting as the main mode. Each request lets you choose the model, the output dimensions, the format, and the quality of the result. This article covers what the action accepts, what it returns, and how to call it from your own code.

Prerequisites

Before you start using the Cognitive Actions, ensure you have the following:

  • An API key for the Cognitive Actions platform to authenticate your requests.
  • Familiarity with making HTTP requests, especially POST requests to an API endpoint.
  • Basic knowledge of JSON for structuring your requests.

Authentication typically involves passing your API key in the request headers, allowing you to securely access the capabilities of the Cognitive Actions.
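
As a minimal sketch of the header-based authentication described above, the helper below reads the key from an environment variable and builds the request headers. The variable name COGNITIVE_ACTIONS_API_KEY and the function name build_auth_headers are illustrative assumptions, and the Bearer scheme is assumed to be what the platform expects; check your account documentation for the exact header.

```python
import os


def build_auth_headers(api_key=None):
    """Return headers carrying the API key as a Bearer token.

    Assumption: the platform accepts `Authorization: Bearer <key>`.
    """
    # Prefer an explicit key, otherwise fall back to the environment.
    key = api_key or os.environ.get("COGNITIVE_ACTIONS_API_KEY")
    if not key:
        raise ValueError("No API key provided; set COGNITIVE_ACTIONS_API_KEY")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Reading the key from the environment keeps it out of source control, which is usually preferable to hard-coding it.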

Cognitive Actions Overview

Generate Image with Inpainting

Description:
This action generates images using an inpainting mode for image-to-image transformations. Developers can choose between models optimized for either speed or quality, set customizable image dimensions, and specify various output formats. The action also includes safety checks and LoRA intensity scaling, making it versatile for different use cases.

Category: image-generation

Input

The Generate Image with Inpainting action accepts the following fields:

  • Required:
    • prompt: A descriptive text prompt that guides the image generation.
  • Optional:
    • mask: URI of the image mask used for inpainting (if applicable).
    • seed: Integer seed for the random number generator; set it for reproducible results.
    • image: URI of the input image for image-to-image transformations.
    • model: Specify "dev" for quality or "schnell" for speed (default is "dev").
    • width & height: Custom dimensions for the output image.
    • loraScale: Main LoRA scale for strength application.
    • outputCount: Number of generated images (1-4).
    • outputQuality: JPEG quality setting (0-100).
    • imageAspectRatio: The aspect ratio for the image.
    • imageOutputFormat: Format for output images (webp, jpg, png).
    • Additional fields for advanced settings, such as enableFastMode, promptStrength, and disableSafetyChecker. The example below also uses inferenceSteps, additionalLoraScale, and diffusionGuidanceScale.
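
The documented constraints can be checked on the client before a request is sent. The sketch below is a hypothetical helper (validate_inputs is not part of the platform) and covers only the ranges and allowed values listed above; the service may enforce additional rules.

```python
ALLOWED_MODELS = {"dev", "schnell"}
ALLOWED_FORMATS = {"webp", "jpg", "png"}


def validate_inputs(inputs):
    """Return a list of problems found in an input dict (empty if none)."""
    problems = []

    # prompt is the only required field.
    prompt = inputs.get("prompt")
    if not isinstance(prompt, str) or not prompt.strip():
        problems.append("prompt is required and must be a non-empty string")

    # model defaults to "dev" when omitted.
    if inputs.get("model", "dev") not in ALLOWED_MODELS:
        problems.append('model must be "dev" or "schnell"')

    if "outputCount" in inputs:
        count = inputs["outputCount"]
        if not isinstance(count, int) or not 1 <= count <= 4:
            problems.append("outputCount must be an integer from 1 to 4")

    if "outputQuality" in inputs:
        quality = inputs["outputQuality"]
        if not isinstance(quality, (int, float)) or not 0 <= quality <= 100:
            problems.append("outputQuality must be between 0 and 100")

    if "imageOutputFormat" in inputs:
        if inputs["imageOutputFormat"] not in ALLOWED_FORMATS:
            problems.append("imageOutputFormat must be webp, jpg, or png")

    return problems
```

Returning a list of problems rather than raising on the first one lets a caller report every issue in a single pass.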

Example Input:

{
  "model": "dev",
  "prompt": "Here is a prompt you could use to generate that image:\n\n\"A 45-year-old man with no tattoos on his arms, wearing black gloves and using an EZ P3 Pro machine, working in a tattoo studio. He is focused while tattooing a Cuban girl, who has a relaxed expression. She has curly hair and wears a colorful blouse. The studio has an artistic feel, with paintings on the walls and tattoo tools all around. The light is soft, creating a cozy atmosphere.\" \n\nYou can adjust it to your preferences. I hope it helps!",
  "loraScale": 1,
  "outputCount": 1,
  "outputQuality": 90,
  "inferenceSteps": 28,
  "promptStrength": 0.8,
  "imageAspectRatio": "1:1",
  "imageOutputFormat": "webp",
  "additionalLoraScale": 1,
  "diffusionGuidanceScale": 3.5
}
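
The example above contains no image or mask, so it does not exercise inpainting itself. A request that does might be assembled as follows. This is only a sketch: make_inpainting_inputs is a hypothetical helper, the example.com URIs are placeholders, and how the service interprets the mask is not specified here.

```python
import json


def make_inpainting_inputs(prompt, image_uri, mask_uri, **overrides):
    """Build an input dict that supplies both a source image and a mask.

    Field names follow the input list above; any other documented field
    can be passed as a keyword argument.
    """
    inputs = {
        "prompt": prompt,
        "image": image_uri,  # source image to transform
        "mask": mask_uri,    # mask image for the inpainting step
    }
    inputs.update(overrides)
    return inputs


inputs = make_inpainting_inputs(
    "a red wooden door",
    "https://example.com/room.png",       # placeholder URI
    "https://example.com/door-mask.png",  # placeholder URI
    promptStrength=0.8,
    imageOutputFormat="webp",
)
print(json.dumps(inputs, indent=2))
```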

Output

The action returns a JSON array containing the URL of each generated image. For example:

[
  "https://assets.cognitiveactions.com/invocations/0908ae62-73bc-40c7-8270-fcb55ae0a65e/6594c1e8-565f-4b50-ac74-4cd25de0cabe.webp"
]
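
Because the result is a list of URLs, saving the images locally takes one download per entry. The sketch below uses only the Python standard library. The names local_path_for and download_all are hypothetical, and the code assumes the URLs can be fetched without extra authentication, which is not stated here.

```python
import os
import posixpath
from urllib.parse import urlparse
from urllib.request import urlretrieve


def local_path_for(url, dest_dir="."):
    """Derive a local file path from the last segment of the URL path."""
    name = posixpath.basename(urlparse(url).path)
    if not name:
        raise ValueError(f"Cannot derive a file name from {url!r}")
    return os.path.join(dest_dir, name)


def download_all(urls, dest_dir="."):
    """Download every URL into dest_dir and return the saved paths."""
    os.makedirs(dest_dir, exist_ok=True)
    paths = []
    for url in urls:
        path = local_path_for(url, dest_dir)
        urlretrieve(url, path)  # performs the network request
        paths.append(path)
    return paths
```

Calling download_all(result, "images") with the array shown above would save one file per URL, named after the final path segment.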

Conceptual Usage Example (Python):

import requests
import json

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute" # Hypothetical endpoint

action_id = "45bc346c-4bb6-49c6-a146-d0cd5db32c9c" # Action ID for Generate Image with Inpainting

# Construct the input payload based on the action's requirements
payload = {
    "model": "dev",
    "prompt": "Here is a prompt you could use to generate that image:\n\n\"A 45-year-old man with no tattoos on his arms, wearing black gloves and using an EZ P3 Pro machine, working in a tattoo studio. He is focused while tattooing a Cuban girl, who has a relaxed expression. She has curly hair and wears a colorful blouse. The studio has an artistic feel, with paintings on the walls and tattoo tools all around. The light is soft, creating a cozy atmosphere.\" \n\nYou can adjust it to your preferences. I hope it helps!",
    "loraScale": 1,
    "outputCount": 1,
    "outputQuality": 90,
    "inferenceSteps": 28,
    "promptStrength": 0.8,
    "imageAspectRatio": "1:1",
    "imageOutputFormat": "webp",
    "additionalLoraScale": 1,
    "diffusionGuidanceScale": 3.5
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json"
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload}, # Hypothetical structure
        timeout=60 # Avoid hanging indefinitely; adjust to your expected generation time
    )
    response.raise_for_status() # Raise an exception for bad status codes (4xx or 5xx)

    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))

except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError: # Body was not valid JSON; covers every JSON decode error requests may raise
            print(f"Response body: {e.response.text}")

This Python snippet demonstrates how to call the Cognitive Actions execution endpoint. The endpoint URL and the request body structure shown are hypothetical, so replace them, along with the API key, with the values for your account. The payload mirrors the example input above; only prompt is required, and the other fields can be omitted or adjusted.
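
For reuse across an application, the same call can be wrapped in a function. The sketch below keeps the hypothetical endpoint and body structure from the snippet above; execute_action, its post parameter, and the 60-second timeout are assumptions added for illustration. Passing a stand-in for post makes the function testable without network access.

```python
# Hypothetical endpoint, as in the snippet above.
EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute"


def execute_action(action_id, inputs, api_key, url=EXECUTE_URL, post=None, timeout=60):
    """Send one action request and return the decoded JSON response.

    `post` defaults to requests.post; a stand-in with the same keyword
    arguments can be injected for testing.
    """
    if post is None:
        import requests  # imported lazily so tests need no HTTP library
        post = requests.post

    response = post(
        url,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        json={"action_id": action_id, "inputs": inputs},  # hypothetical structure
        timeout=timeout,
    )
    response.raise_for_status()  # surface 4xx/5xx responses as exceptions
    return response.json()
```

A caller would then write something like execute_action(action_id, payload, COGNITIVE_ACTIONS_API_KEY) and handle requests.exceptions.RequestException around it.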

Conclusion

The yasseryera/yasser Cognitive Actions generate images from a text prompt, with optional image and mask inputs for inpainting. You can choose between the "dev" and "schnell" models, set dimensions or an aspect ratio, pick an output format, and adjust output quality. Explore these actions further, and consider integrating them into your next application for dynamic image generation. Happy coding!