Create Stunning Images with Cognitive Actions from ad-ams/lanota-cap

24 Apr 2025

Generating high-quality images programmatically can improve user experience and engagement in applications. The Cognitive Actions from the "ad-ams/lanota-cap" spec let developers generate images with customizable attributes such as aspect ratio, size, and output format, and they also support image inpainting. Because the actions are pre-built, you can spend less time on integration work and more on the creative side of image generation.

Prerequisites

Before you start using the Cognitive Actions, ensure you have the following:

  • An API key for the Cognitive Actions platform.
  • Familiarity with making HTTP requests and handling JSON data.

Authentication typically involves passing your API key in the headers of your request, allowing you to securely access the cognitive services.
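
As a minimal sketch of header-based authentication, the helper below builds the request headers from an API key. It assumes a Bearer-token scheme, which is what the conceptual example later in this post uses; the environment variable name COGNITIVE_ACTIONS_API_KEY and the helper name build_headers are hypothetical and chosen only for illustration.

```python
import os


def build_headers(api_key: str) -> dict:
    """Return request headers carrying the API key as a Bearer token (assumed scheme)."""
    if not api_key:
        raise ValueError("API key is empty")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }


# Hypothetical environment variable name; falls back to a placeholder string.
api_key = os.environ.get("COGNITIVE_ACTIONS_API_KEY", "YOUR_COGNITIVE_ACTIONS_API_KEY")
headers = build_headers(api_key)
```

Reading the key from the environment keeps it out of source control; check the platform's documentation for the header format it actually expects.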

Cognitive Actions Overview

Generate Image with Image Inpainting

Description:
This action generates images with customizable attributes such as aspect ratio, size, and resolution while supporting image inpainting. It allows you to enhance image creation through detailed prompts and adjust quality settings for optimal output.

Category: Image Generation

Input

The input schema for this action includes several fields. The only required field is the prompt. Here’s a breakdown of the input schema:

  • prompt (string): Text prompt for generating the image. Including "trigger words" used during training increases the likelihood of activating the desired object or style.
  • mask (string, optional): URI of the image mask used for inpainting mode.
  • seed (integer, optional): Seed for random number generation to ensure reproducible results.
  • image (string, optional): URI of the input image for image-to-image or inpainting mode.
  • width (integer, optional): Width of the generated image in pixels (256 to 1440).
  • height (integer, optional): Height of the generated image in pixels (256 to 1440).
  • aspectRatio (string, optional): Specifies the aspect ratio for the generated image (e.g., "1:1", "16:9", "custom").
  • imageFormat (string, optional): File format for the output image (defaults to "webp").
  • outputCount (integer, optional): Total number of images to generate (1 to 4).
  • enableFastMode (boolean, optional): Enable fast generation mode.
  • inferenceModel (string, optional): Choose the model for image inference ("dev" or "schnell").
  • promptStrength (number, optional): Indicates prompt strength for img2img transformation.
  • imageMegapixels (string, optional): Desired number of megapixels for the generated image.
  • denoiseStepsCount (integer, optional): Number of steps for the denoising process.
  • imageGuidanceScale (number, optional): Scale for guiding the diffusion process.
  • outputImageQuality (integer, optional): Sets the output image quality.
  • additionalLoraScale (number, optional): Adjusts the influence of additional LoRA.
  • loraApplicationScale (number, optional): Adjusts the influence of the main LoRA.
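
The documented constraints in the list above can be checked on the client before a request is sent. The following is a rough sketch, not part of the platform: validate_inputs is a hypothetical helper, and it covers only the limits stated in this post (required prompt, width and height from 256 to 1440, outputCount from 1 to 4, inferenceModel of "dev" or "schnell"). The service may enforce additional rules.

```python
ALLOWED_MODELS = {"dev", "schnell"}


def _is_int(value) -> bool:
    """True for real integers; bool is excluded even though it subclasses int."""
    return isinstance(value, int) and not isinstance(value, bool)


def validate_inputs(inputs: dict) -> list:
    """Return a list of problems found; an empty list means no issue was detected."""
    problems = []

    prompt = inputs.get("prompt")
    if not isinstance(prompt, str) or not prompt.strip():
        problems.append("prompt is required and must be a non-empty string")

    for field in ("width", "height"):
        value = inputs.get(field)
        if value is not None and not (_is_int(value) and 256 <= value <= 1440):
            problems.append(f"{field} must be an integer from 256 to 1440")

    count = inputs.get("outputCount")
    if count is not None and not (_is_int(count) and 1 <= count <= 4):
        problems.append("outputCount must be an integer from 1 to 4")

    model = inputs.get("inferenceModel")
    if model is not None and model not in ALLOWED_MODELS:
        problems.append('inferenceModel must be "dev" or "schnell"')

    return problems
```

For example, validate_inputs({"prompt": "a gym", "width": 100}) reports the out-of-range width, while a payload containing only a prompt passes.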

Example Input:

{
  "width": 1024,
  "height": 1024,
  "prompt": "In a modern, sleek gym environment filled with state-of-the-art equipment...",
  "aspectRatio": "1:1",
  "imageFormat": "png",
  "outputCount": 2,
  "enableFastMode": false,
  "inferenceModel": "dev",
  "promptStrength": 0.8,
  "imageMegapixels": "1",
  "denoiseStepsCount": 40,
  "imageGuidanceScale": 3,
  "outputImageQuality": 80,
  "additionalLoraScale": 1,
  "loraApplicationScale": 1
}
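
The example above is a plain text-to-image request. For inpainting, the schema's image and mask fields come into play. The sketch below shows one way such a payload might be assembled; build_inpainting_inputs is a hypothetical helper, the example.com URLs are placeholders, and both the requirement to supply image and mask together and the use of promptStrength in inpainting mode are assumptions rather than documented behavior.

```python
def build_inpainting_inputs(prompt, image_uri, mask_uri, prompt_strength=None):
    """Assemble an input payload for inpainting mode (illustrative only)."""
    # Assumption: inpainting needs both the source image and its mask.
    if not image_uri or not mask_uri:
        raise ValueError("inpainting needs both an input image URI and a mask URI")

    inputs = {"prompt": prompt, "image": image_uri, "mask": mask_uri}

    # promptStrength is documented for img2img; passing it here is an assumption.
    if prompt_strength is not None:
        inputs["promptStrength"] = prompt_strength
    return inputs


inpainting_inputs = build_inpainting_inputs(
    prompt="A modern gym with a row of treadmills by the window",
    image_uri="https://example.com/gym.png",        # placeholder URL
    mask_uri="https://example.com/gym-mask.png",    # placeholder URL
    prompt_strength=0.8,
)
```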

Output

The action typically returns an array of image URLs corresponding to the generated images. Here’s an example output:

Example Output:

[
  "https://assets.cognitiveactions.com/invocations/87a87722-9372-4fa8-aef1-55981c540b6b/6c721163-d6da-437c-a540-d4138873a6b5.png",
  "https://assets.cognitiveactions.com/invocations/87a87722-9372-4fa8-aef1-55981c540b6b/0be02b01-1e02-4edb-a1c2-d9034d0f25ef.png"
]
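
Once the URLs are returned, a common next step is saving the images locally. The following sketch uses only the Python standard library and assumes the URLs can be fetched without extra authentication, which this post does not confirm. The helper names filename_from_url and download_images and the default directory name are hypothetical.

```python
import shutil
from pathlib import Path, PurePosixPath
from urllib.parse import urlparse
from urllib.request import urlopen


def filename_from_url(url: str) -> str:
    """Use the last path segment of the URL as the local file name."""
    return PurePosixPath(urlparse(url).path).name


def download_images(urls, dest_dir="generated"):
    """Download each URL into dest_dir and return the list of saved paths."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    saved = []
    for url in urls:
        target = dest / filename_from_url(url)
        # Stream the response body to disk instead of loading it all into memory.
        with urlopen(url, timeout=30) as resp, open(target, "wb") as fh:
            shutil.copyfileobj(resp, fh)
        saved.append(target)
    return saved
```

Calling download_images(result) would work if the response body is the bare array shown above; if the API wraps the array in another object, extract the list of URLs first.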

Conceptual Usage Example (Python)

Here’s a conceptual Python code snippet demonstrating how to call this Cognitive Action:

import requests
import json

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute" # Hypothetical endpoint

action_id = "b5d5e914-5d34-4859-b4f5-ef449fae9f06" # Action ID for Generate Image with Image Inpainting

# Construct the input payload based on the action's requirements
payload = {
    "width": 1024,
    "height": 1024,
    "prompt": "In a modern, sleek gym environment filled with state-of-the-art equipment...",
    "aspectRatio": "1:1",
    "imageFormat": "png",
    "outputCount": 2,
    "enableFastMode": False,  # Python boolean; serialized as JSON false by requests
    "inferenceModel": "dev",
    "promptStrength": 0.8,
    "imageMegapixels": "1",
    "denoiseStepsCount": 40,
    "imageGuidanceScale": 3,
    "outputImageQuality": 80,
    "additionalLoraScale": 1,
    "loraApplicationScale": 1
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json"
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload},
        timeout=120  # Arbitrary value; without a timeout the call can block indefinitely
    )
    response.raise_for_status()  # Raise an exception for bad status codes (4xx or 5xx)

    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))

except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError:  # Covers JSON decode errors from both json and simplejson backends
            print(f"Response body: {e.response.text}")

This snippet shows how to structure the input JSON payload and send it to the Cognitive Actions API. Replace the placeholder API key with your own. The endpoint URL and the request envelope (action_id and inputs) are illustrative, so check them against the platform's documentation before use.

Conclusion

The "ad-ams/lanota-cap" Cognitive Actions let developers generate images tailored to specific needs, from aspect ratio and resolution to inpainting of existing images. Experiment with prompts and the optional parameters to find the settings that suit your use case, then integrate the action into your application.