Unlocking Creative Potential: Image Generation with doobls-ai Cognitive Actions

22 Apr 2025

Image generation has advanced quickly in recent years, and inpainting, which regenerates a masked region of an existing image, is one of its most practical techniques. The doobls-ai/golliersrasse-selfcaptioned spec introduces Cognitive Actions designed to generate images through inpainting. This article shows developers how to integrate these actions into their applications, covering the input schema, the output format, and a practical example.

Prerequisites

Before you begin integrating these Cognitive Actions, ensure you have the following:

  • API Key: You will need an API key to authenticate your requests to the Cognitive Actions platform.
  • Development Environment: A working environment set up for making HTTP requests (e.g., Python with the requests library).

Authentication typically involves passing the API key in the request headers, allowing you to securely access the service’s features.
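
As a minimal sketch of header-based authentication, the helper below builds the request headers from an API key, falling back to an environment variable. The helper name build_auth_headers, the environment variable name COGNITIVE_ACTIONS_API_KEY, and the Bearer scheme are assumptions for illustration; check the platform's documentation for the exact header it expects.

```python
import os


def build_auth_headers(api_key=None):
    """Return HTTP headers carrying the API key as a Bearer token (assumed scheme)."""
    # Fall back to an environment variable so the key stays out of source control.
    key = api_key or os.environ.get("COGNITIVE_ACTIONS_API_KEY")
    if not key:
        raise ValueError("No API key provided and COGNITIVE_ACTIONS_API_KEY is not set")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Reading the key from the environment keeps it out of committed code, which matters once the script leaves your machine.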

Cognitive Actions Overview

Generate Image Using Inpainting

Description: This action generates images by inpainting, using a specified image mask and AI model. You can choose either the 'dev' or the 'schnell' model for inference, and the action supports fast image generation and a range of output configurations.

Category: image-generation

Input:

The action requires a JSON payload that includes various fields. Below is a breakdown of the required and optional fields based on the provided schema:

  • Required Fields:
    • prompt: (string) The text guiding the image generation.
  • Optional Fields:
    • mask: (string) URI of the image mask used in inpainting mode.
    • seed: (integer) Random seed for reproducibility.
    • image: (string) URI for the input image.
    • width: (integer) Width of the generated image (valid only when imageAspectRatio is 'custom').
    • height: (integer) Height of the generated image (valid only when imageAspectRatio is 'custom').
    • loraIntensity: (number) Strength of the main LoRA application.
    • enableFastMode: (boolean) Toggle for faster predictions.
    • inferenceModel: (string) Select between 'dev' or 'schnell'.
    • imageMegapixels: (string) Approximate number of megapixels.
    • numberOfOutputs: (integer) Specify the number of images to generate (1 to 4).
    • imageAspectRatio: (string) Aspect ratio for the generated image.
    • customLoraWeights: (string) Load custom LoRA weights.
    • imageOutputFormat: (string) Specify output image format (webp, jpg, png).
    • imageOutputQuality: (integer) Quality of the output images (0 to 100).
    • imagePromptStrength: (number) Strength of the prompt in image generation.
    • inferenceStepsCount: (integer) Steps for denoising (1 to 50).
    • safetyCheckerDisabled: (boolean) Set to true to disable the safety checker.
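
The constraints in the field list can be checked on the client before a request is sent. The following is a minimal sketch, not part of the Cognitive Actions platform: validate_payload is a hypothetical helper, it only covers the ranges and allowed values stated above, and it assumes that width and height are tied to the imageAspectRatio field being 'custom'.

```python
ALLOWED_MODELS = {"dev", "schnell"}
ALLOWED_FORMATS = {"webp", "jpg", "png"}
# Inclusive integer ranges stated in the field list.
INT_RANGES = {
    "numberOfOutputs": (1, 4),
    "imageOutputQuality": (0, 100),
    "inferenceStepsCount": (1, 50),
}


def validate_payload(payload):
    """Return a list of human-readable problems; an empty list means none were found."""
    problems = []
    prompt = payload.get("prompt")
    if not isinstance(prompt, str) or not prompt.strip():
        problems.append("prompt is required and must be a non-empty string")
    for field, (low, high) in INT_RANGES.items():
        if field in payload:
            value = payload[field]
            # bool is a subclass of int in Python, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, int) or not low <= value <= high:
                problems.append(f"{field} must be an integer from {low} to {high}")
    if "inferenceModel" in payload and payload["inferenceModel"] not in ALLOWED_MODELS:
        problems.append("inferenceModel must be 'dev' or 'schnell'")
    if "imageOutputFormat" in payload and payload["imageOutputFormat"] not in ALLOWED_FORMATS:
        problems.append("imageOutputFormat must be webp, jpg, or png")
    # width and height only apply when the aspect ratio is 'custom'.
    if payload.get("imageAspectRatio") != "custom":
        for field in ("width", "height"):
            if field in payload:
                problems.append(f"{field} is only valid when imageAspectRatio is 'custom'")
    return problems
```

Returning a list of problems rather than raising on the first one lets a caller report every issue in a single pass.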

Example Input:

{
  "prompt": "golliersrasse Imagine a modern office room designed for productivity and comfort. A central desk faces the entrance...",
  "loraIntensity": 1,
  "enableFastMode": false,
  "inferenceModel": "dev",
  "imageMegapixels": "1",
  "numberOfOutputs": 3,
  "imageAspectRatio": "1:1",
  "imageOutputFormat": "webp",
  "imageOutputQuality": 80,
  "imagePromptStrength": 0.8,
  "inferenceStepsCount": 28,
  "diffusionGuidanceScale": 3,
  "additionalLoraIntensity": 1
}
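
The example input above does not set image or mask, the two fields the schema describes for inpainting mode. The sketch below shows one way such a payload might be assembled. The helper make_inpainting_payload, the shortened prompt, and both example.com URLs are hypothetical placeholders, not values from the platform.

```python
import json

# Hypothetical placeholder URLs; replace them with your own hosted files.
SOURCE_IMAGE_URI = "https://example.com/office.png"
MASK_IMAGE_URI = "https://example.com/office-mask.png"


def make_inpainting_payload(prompt, image_uri, mask_uri, **options):
    """Combine the required prompt with the image and mask URIs and any optional fields."""
    payload = {"prompt": prompt, "image": image_uri, "mask": mask_uri}
    payload.update(options)
    return payload


inpainting_payload = make_inpainting_payload(
    "golliersrasse a modern office room with a central desk",  # illustrative prompt
    SOURCE_IMAGE_URI,
    MASK_IMAGE_URI,
    inferenceModel="dev",
    imagePromptStrength=0.8,
    numberOfOutputs=1,
)
print(json.dumps(inpainting_payload, indent=2))
```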

Output:

The action returns a list of URIs pointing to the generated images. Below is a sample output:

[
  "https://assets.cognitiveactions.com/invocations/d355b79a-95c8-45c8-a31a-9bb3a10a5c9a/6de6ecd3-2ef4-4784-a2cb-8b7d1a7d08da.webp",
  "https://assets.cognitiveactions.com/invocations/d355b79a-95c8-45c8-a31a-9bb3a10a5c9a/5d5536fd-d18e-4aac-949c-00920ea39455.webp",
  "https://assets.cognitiveactions.com/invocations/d355b79a-95c8-45c8-a31a-9bb3a10a5c9a/b054f404-fce9-45ae-bf88-66c8abe5a71d.webp"
]
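
Because the output is a plain list of URIs, saving the results locally takes only a few lines. The sketch below uses only the Python standard library. The helper names and the target directory are hypothetical, and it assumes the URIs can be fetched without extra authentication, which the source does not confirm.

```python
import os
import posixpath
import urllib.request
from urllib.parse import urlparse


def filename_from_uri(uri):
    """Return the last path segment of a URI, which here is the image file name."""
    return posixpath.basename(urlparse(uri).path)


def download_images(uris, target_dir="generated"):
    """Download each URI into target_dir and return the local file paths."""
    os.makedirs(target_dir, exist_ok=True)
    saved = []
    for uri in uris:
        destination = os.path.join(target_dir, filename_from_uri(uri))
        urllib.request.urlretrieve(uri, destination)  # performs a network request
        saved.append(destination)
    return saved
```

Calling download_images with the list returned by the action would write one file per generated image, named after the last segment of its URI.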

Conceptual Usage Example (Python):

Here’s how you might call the Cognitive Actions execution endpoint using Python:

import requests
import json

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute"  # Hypothetical endpoint

action_id = "091240ce-81a7-42bd-8b79-9e83f25add32"  # Action ID for Generate Image Using Inpainting

# Construct the input payload based on the action's requirements
payload = {
    "prompt": "golliersrasse Imagine a modern office room designed for productivity and comfort. A central desk faces the entrance...",
    "loraIntensity": 1,
    "enableFastMode": False,
    "inferenceModel": "dev",
    "imageMegapixels": "1",
    "numberOfOutputs": 3,
    "imageAspectRatio": "1:1",
    "imageOutputFormat": "webp",
    "imageOutputQuality": 80,
    "imagePromptStrength": 0.8,
    "inferenceStepsCount": 28,
    "diffusionGuidanceScale": 3,
    "additionalLoraIntensity": 1
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json"
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload},  # Hypothetical structure
        timeout=60,  # Avoid hanging indefinitely if the service does not respond
    )
    response.raise_for_status()  # Raise an exception for bad status codes (4xx or 5xx)

    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))

except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError:  # Body is not valid JSON; fall back to the raw text
            print(f"Response body: {e.response.text}")

In this code snippet, replace YOUR_COGNITIVE_ACTIONS_API_KEY with your actual API key. The payload variable is constructed to match the input schema for the action. Note that the endpoint URL and request structure are illustrative; adjust them to match the actual Cognitive Actions API before running the code.

Conclusion

The doobls-ai/golliersrasse-selfcaptioned Cognitive Actions provide developers with robust tools for generating images through inpainting. By leveraging these pre-built actions, you can quickly integrate advanced image generation capabilities into your applications. Consider exploring additional use cases, such as art generation, design mockups, or content for marketing materials. Happy coding!