Enhance Your Applications with Image Generation Using erbridge/dbag Cognitive Actions

24 Apr 2025

The erbridge/dbag Cognitive Actions let developers generate and edit images programmatically, including through image inpainting. Inputs such as image masks, a guidance scale, and a choice of models control the result, so you can tailor the output to your needs. In this article, we'll explore the main features of the Generate Image with Inpainting action, its requirements, and how to integrate it into your applications.

Prerequisites

To use the Cognitive Actions, you will need:

  • An API key for accessing the Cognitive Actions platform.
  • Basic understanding of JSON payload structures and HTTP requests.

Authentication typically involves passing your API key in the headers of your requests.
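
As a minimal sketch of that step, the helper below builds the request headers from an API key. The environment variable name COGNITIVE_ACTIONS_API_KEY and the Bearer scheme are assumptions carried over from the conceptual example later in this article, not confirmed platform behavior, so check the platform documentation for the actual header format.

```python
import os


def build_headers(api_key=None):
    """Return HTTP headers for a Cognitive Actions request.

    Assumption: the platform accepts a Bearer token in the Authorization
    header, as in the conceptual example in this article.
    """
    # Fall back to an environment variable (hypothetical name) so the key
    # does not have to be hard-coded in source files.
    key = api_key or os.environ.get("COGNITIVE_ACTIONS_API_KEY")
    if not key:
        raise ValueError(
            "No API key provided and COGNITIVE_ACTIONS_API_KEY is not set"
        )
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Reading the key from the environment keeps it out of version control; passing it explicitly remains possible for tests.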

Cognitive Actions Overview

Generate Image with Inpainting

The Generate Image with Inpainting action allows you to create detailed images by using existing images and masks as input. This action provides options for different models to balance quality and inference speed, making it suitable for various applications, from artistic rendering to practical use cases.

Input: Only a text prompt is required; the remaining parameters are optional and customize the output.

Here’s a breakdown of the input schema:

  • prompt (required): Text prompt to guide image generation.
  • mask (optional): URI for the image mask used in inpainting mode.
  • image (optional): URI for the input image for image-to-image transformations.
  • model (optional): Specifies the model to use (dev or schnell).
  • width and height (optional): Dimensions of the generated image (applicable only if aspectRatio is set to 'custom').
  • aspectRatio (optional): Aspect ratio for the generated image.
  • numOutputs (optional): Number of images to generate (1 to 4).
  • outputFormat (optional): Format of the output image (e.g., webp, jpg, png).
  • guidanceScale (optional): Controls the guidance scale for the diffusion process.
  • outputQuality (optional): Quality of the output image on a scale from 0 to 100.
  • numInferenceSteps (optional): Number of steps for denoising during image generation.
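
The constraints in this breakdown can be checked on the client before a request is sent. The following is a rough sketch: validate_inputs is a hypothetical helper, and it covers only the limits stated in the list above (required prompt, the two model names, 1 to 4 outputs, quality from 0 to 100, and width/height only with a custom aspect ratio). The service itself may enforce additional or different rules.

```python
ALLOWED_MODELS = {"dev", "schnell"}


def validate_inputs(inputs):
    """Return a list of problems found in an input payload (empty if none)."""
    problems = []

    prompt = inputs.get("prompt")
    if not isinstance(prompt, str) or not prompt.strip():
        problems.append("prompt is required and must be a non-empty string")

    if "model" in inputs and inputs["model"] not in ALLOWED_MODELS:
        problems.append("model must be 'dev' or 'schnell'")

    n = inputs.get("numOutputs")
    if n is not None and not (isinstance(n, int) and 1 <= n <= 4):
        problems.append("numOutputs must be an integer from 1 to 4")

    q = inputs.get("outputQuality")
    if q is not None and not (isinstance(q, (int, float)) and 0 <= q <= 100):
        problems.append("outputQuality must be between 0 and 100")

    # width and height apply only when aspectRatio is 'custom'.
    has_size = "width" in inputs or "height" in inputs
    if has_size and inputs.get("aspectRatio") != "custom":
        problems.append("width/height are only used when aspectRatio is 'custom'")

    return problems
```

Returning a list of messages rather than raising on the first failure lets a caller report every problem in one pass.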

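Example Input (text-to-image, with no image or mask; note that mainLoraScale is not described in the breakdown above):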

{
  "model": "dev",
  "prompt": "TOK, A vibrant watercolor painting of TOK lounging on a sun-drenched beach. Soft, translucent washes of turquoise and azure blend seamlessly for the ocean, while golden sand is rendered with loose, expressive brushstrokes. TOK's form is captured with fluid lines and gentle color gradients, surrounded by splashes of coral and lavender from nearby seashells. Wispy clouds drift across a warm, pastel sky, creating a dreamy atmosphere. The composition balances TOK's relaxed posture against the vast, serene seascape.",
  "numOutputs": 1,
  "aspectRatio": "1:1",
  "outputFormat": "webp",
  "guidanceScale": 7.5,
  "mainLoraScale": 1.2,
  "outputQuality": 80,
  "numInferenceSteps": 50
}
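
The example above contains neither an image nor a mask, so it does not exercise inpainting itself. As a sketch of what an inpainting input might look like, the hypothetical helper below combines a prompt with the image and mask fields from the input breakdown. It assumes both URIs are needed for inpainting, which the source does not state explicitly, and the example.com URLs are placeholders.

```python
def build_inpainting_inputs(prompt, image_url, mask_url, **options):
    """Combine a prompt, source image, and mask into one input payload.

    image_url and mask_url are URIs, matching the `image` and `mask`
    fields in the input breakdown. Extra keyword arguments (for example
    numOutputs or outputFormat) are passed through unchanged.
    """
    if not prompt:
        raise ValueError("prompt is required")
    # Assumption: inpainting needs both the source image and its mask.
    if not image_url or not mask_url:
        raise ValueError("inpainting needs both an image and a mask URI")
    inputs = {"prompt": prompt, "image": image_url, "mask": mask_url}
    inputs.update(options)
    return inputs


# Placeholder URLs for illustration only.
inputs = build_inpainting_inputs(
    "A watercolor seashell on golden sand",
    "https://example.com/beach.png",
    "https://example.com/beach-mask.png",
    model="schnell",
    numOutputs=2,
)
```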

Output: The action typically returns a list of URLs pointing to the generated images. For example:

[
  "https://assets.cognitiveactions.com/invocations/08c711f4-e3b3-4d71-9c67-3d594a17e97c/ef714b8a-39ec-4777-9ad5-4e057c1375da.webp"
]
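
Because the output is a plain list of URLs, saving the results locally mostly comes down to choosing filenames. The sketch below uses a hypothetical helper, output_filenames, to derive one local name per URL while keeping the file extension from the URL path. It assumes the response has the list shape shown above.

```python
import os
from urllib.parse import urlparse


def output_filenames(urls, prefix="image"):
    """Map each output URL to a local filename that keeps its extension."""
    names = {}
    for index, url in enumerate(urls, start=1):
        # Take the extension (e.g. ".webp") from the URL path.
        ext = os.path.splitext(urlparse(url).path)[1]
        names[url] = f"{prefix}_{index}{ext}"
    return names
```

Each file can then be fetched with requests.get(url) and written from response.content under the mapped name.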

Conceptual Usage Example (Python): Here’s how you might call the Generate Image with Inpainting action using Python:

import requests
import json

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute" # Hypothetical endpoint

action_id = "3874ff4c-35c4-4978-bf35-3d680b87e1c5" # Action ID for Generate Image with Inpainting

# Construct the input payload based on the action's requirements
payload = {
    "model": "dev",
    "prompt": "TOK, A vibrant watercolor painting of TOK lounging on a sun-drenched beach. Soft, translucent washes of turquoise and azure blend seamlessly for the ocean, while golden sand is rendered with loose, expressive brushstrokes. TOK's form is captured with fluid lines and gentle color gradients, surrounded by splashes of coral and lavender from nearby seashells. Wispy clouds drift across a warm, pastel sky, creating a dreamy atmosphere. The composition balances TOK's relaxed posture against the vast, serene seascape.",
    "numOutputs": 1,
    "aspectRatio": "1:1",
    "outputFormat": "webp",
    "guidanceScale": 7.5,
    "mainLoraScale": 1.2,
    "outputQuality": 80,
    "numInferenceSteps": 50
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json"
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload}, # Hypothetical structure
        timeout=60 # Avoid hanging indefinitely if the server does not respond
    )
    response.raise_for_status() # Raise an exception for bad status codes (4xx or 5xx)

    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))

except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError: # Body is not valid JSON; ValueError covers every JSON decode error requests can raise
            print(f"Response body: {e.response.text}")

In this code snippet, replace the placeholder values with your actual API key and endpoint. The endpoint URL and the request body structure are hypothetical, so confirm both against the platform documentation. The action ID corresponds to the "Generate Image with Inpainting" action, and the input payload follows the input schema described above.

Conclusion

The erbridge/dbag Cognitive Actions, particularly the Generate Image with Inpainting action, give developers a tool for creating and manipulating images programmatically. The available parameters let you fine-tune image generation for your specific needs, whether for art, design, or other creative applications. As you explore these capabilities, consider how they might enhance your applications and workflows. Happy coding!