Generate Stunning Images with the vcollos/nanda Cognitive Actions

23 Apr 2025

In today's digital landscape, the ability to generate and manipulate images programmatically has opened new avenues for creativity and innovation. The vcollos/nanda Cognitive Actions provide a powerful set of tools for image generation, allowing developers to harness advanced techniques such as image-to-image transformation and inpainting. This article will guide you through the capabilities of these actions, empowering you to integrate them into your own applications seamlessly.

Prerequisites

Before you start using the Cognitive Actions provided by the vcollos/nanda spec, ensure you have the following:

  • An API key for the Cognitive Actions platform.
  • Familiarity with making HTTP requests and handling JSON data.

You will typically authenticate by passing your API key in the headers of your requests.
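
As a minimal sketch of that authentication step, the helper below builds the request headers. It assumes the key is sent as a Bearer token (the scheme used in the Python example later in this article) and that it is stored in an environment variable named COGNITIVE_ACTIONS_API_KEY. Both the helper name and the variable name are illustrative, so check the platform documentation for the exact scheme.

```python
import os


def build_headers(api_key=None):
    """Return HTTP headers that carry the API key as a Bearer token."""
    # Assumption: the key is kept in the COGNITIVE_ACTIONS_API_KEY
    # environment variable when it is not passed in explicitly.
    key = api_key or os.environ.get("COGNITIVE_ACTIONS_API_KEY")
    if not key:
        raise ValueError("No API key found; set COGNITIVE_ACTIONS_API_KEY")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Reading the key from the environment keeps it out of source control, which is usually preferable to hard-coding it.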

Cognitive Actions Overview

Generate Image with Inpainting

The Generate Image with Inpainting action generates images from a text prompt and, optionally, from an input image (image-to-image mode) and a mask (inpainting mode). It exposes parameters such as width, height, aspect ratio, output format, and image quality, which makes it useful when you need precise control over how an image is generated or edited.

Input

The input for this action is structured as follows:

{
  "prompt": "portrait of Nanda sitted on the chair with sexy face",
  "goFast": false,
  "loraScale": 1,
  "modelType": "dev",
  "imageFormat": "webp",
  "outputCount": 1,
  "imageQuality": 80,
  "pixelDensity": "1",
  "guidanceScale": 3,
  "extraLoraScale": 1,
  "promptStrength": 0.8,
  "imageAspectRatio": "1:1",
  "numInferenceSteps": 28
}

Required Fields:

  • prompt: A text prompt guiding the image generation process.

Optional Fields:

  • mask: URI for the image mask for inpainting mode.
  • seed: Random seed for reproducibility.
  • image: URI of the input image for image-to-image mode.
  • width, height: Dimensions for the generated image; used only when imageAspectRatio is set to custom.
  • goFast: Enable a model optimized for speed.
  • modelType: Choose between 'dev' or 'schnell' models for inference.
  • imageFormat: The output format of the image (e.g., webp, jpg, png).
  • outputCount: Number of outputs to generate.
  • imageQuality: Quality level of the output image.
  • pixelDensity: Estimated megapixels for the generated image.
  • guidanceScale: How strongly generation is steered toward the prompt; higher values follow the prompt more closely.
  • loraScale: Strength of the main LoRA effect.
  • extraLoraScale: Strength of additional LoRA effects.
  • promptStrength: In image-to-image mode, controls how far the result departs from the input image; higher values give the prompt more influence.
  • imageAspectRatio: Aspect ratio for the generated image.
  • numInferenceSteps: Number of denoising steps; more steps generally yield finer detail at the cost of slower generation.
  • safetyCheckerDisabled: Option to disable the safety checker.
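
To illustrate how the optional fields combine, here is a small sketch that assembles a payload for image-to-image or inpainting mode. The function name and the client-side checks are assumptions made for illustration: the source only states that width and height apply when the aspect ratio is custom, and the rule that a mask needs an input image is inferred from how inpainting works rather than taken from the action's schema.

```python
def build_inpainting_payload(prompt, image=None, mask=None,
                             width=None, height=None, **options):
    """Assemble an input payload and check a few field combinations."""
    if not prompt:
        raise ValueError("prompt is required")
    # Assumption: a mask is only meaningful alongside an input image.
    if mask is not None and image is None:
        raise ValueError("mask requires an input image")

    payload = {"prompt": prompt}
    if image is not None:
        payload["image"] = image
    if mask is not None:
        payload["mask"] = mask

    if width is not None or height is not None:
        # width/height are documented as applying to a custom aspect ratio.
        if options.get("imageAspectRatio") != "custom":
            raise ValueError("width/height need imageAspectRatio='custom'")
        if width is not None:
            payload["width"] = width
        if height is not None:
            payload["height"] = height

    # Remaining optional fields (goFast, seed, imageFormat, ...) pass through.
    payload.update(options)
    return payload


# Example with placeholder URLs (hypothetical, replace with your own assets).
example = build_inpainting_payload(
    "portrait in soft window light",
    image="https://example.com/portrait.png",
    mask="https://example.com/portrait-mask.png",
    promptStrength=0.8,
)
```

Failing early on the client avoids spending an API call on a request the action would likely reject or silently adjust.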

Output

The output from this action typically consists of an array of image URLs, such as:

[
  "https://assets.cognitiveactions.com/invocations/fe25ebc5-7352-4483-93b0-690148b17d71/9ce20bba-20b1-4d7e-a221-92ecf906e918.webp"
]

This array contains the links to the generated images based on the input parameters provided.
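
Since the action returns URLs rather than image bytes, you will usually want to download the files. The sketch below shows one way to do that with Python's standard library. It assumes the URLs can be fetched without extra authentication, and the function names and the outputs directory are illustrative choices, not part of the platform.

```python
import os
import posixpath
import shutil
import urllib.request
from urllib.parse import urlparse


def filename_from_url(url):
    """Return the last path segment of a URL to use as a local file name."""
    return posixpath.basename(urlparse(url).path)


def download_images(urls, dest_dir="outputs", timeout=30):
    """Download each URL into dest_dir and return the saved file paths."""
    os.makedirs(dest_dir, exist_ok=True)
    saved = []
    for url in urls:
        path = os.path.join(dest_dir, filename_from_url(url))
        # urlopen raises urllib.error.HTTPError on 4xx/5xx responses.
        with urllib.request.urlopen(url, timeout=timeout) as resp, \
                open(path, "wb") as fh:
            shutil.copyfileobj(resp, fh)
        saved.append(path)
    return saved
```

Calling download_images(result_urls) with the array returned by the action would save each image under its original file name.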

Conceptual Usage Example (Python)

Here’s how you might use the Generate Image with Inpainting action in a Python application:

import requests
import json

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute" # Hypothetical endpoint

action_id = "538d8335-e8a1-4a96-b18c-939633f5fd6a" # Action ID for Generate Image with Inpainting

# Construct the input payload based on the action's requirements
payload = {
    "prompt": "portrait of Nanda sitted on the chair with sexy face",
    "goFast": False,
    "loraScale": 1,
    "modelType": "dev",
    "imageFormat": "webp",
    "outputCount": 1,
    "imageQuality": 80,
    "pixelDensity": "1",
    "guidanceScale": 3,
    "extraLoraScale": 1,
    "promptStrength": 0.8,
    "imageAspectRatio": "1:1",
    "numInferenceSteps": 28
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json"
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload}, # Hypothetical structure
        timeout=120 # Avoid hanging indefinitely; tune to expected generation time
    )
    response.raise_for_status() # Raise an exception for bad status codes (4xx or 5xx)

    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))

except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError: # Body is not valid JSON (covers every JSON decode error variant)
            print(f"Response body: {e.response.text}")

In this code snippet, replace the placeholder for your API key and adjust the endpoint as necessary. The payload variable contains the input structured according to the action's schema. The code demonstrates how to send a POST request to execute the action and handle the response.

Conclusion

The vcollos/nanda Cognitive Actions provide a robust framework for image generation, enabling developers to create and manipulate images with precision and creativity. By leveraging actions like Generate Image with Inpainting, you can enhance your applications with advanced image processing capabilities.

As you explore these actions, consider experimenting with different parameters to find the perfect settings for your needs. Happy coding!