Enhance Your Applications with daoud23/pont Cognitive Actions: A Guide to Image Generation

21 Apr 2025

Generating and editing images on demand can add real functionality to an application. The daoud23/pont spec provides Cognitive Actions for image generation and inpainting, with settings that cover the prompt, model, aspect ratio, output format, and number of inference steps. Because the actions are pre-built, integrating them comes down to sending an authenticated request with a JSON payload and reading back the URLs of the generated images.

Prerequisites

To use the Cognitive Actions in the daoud23/pont spec, you need access to the Cognitive Actions platform. This generally means obtaining an API key and passing it in the headers of each request to authenticate the call. Make sure your account has permission to run the image generation actions before you start.
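
A minimal sketch of the authentication step, assuming the key is kept in an environment variable named COGNITIVE_ACTIONS_API_KEY and sent as a Bearer token (the scheme used in the usage example later in this guide). The helper name build_auth_headers is hypothetical and not part of any SDK.

```python
import os


def build_auth_headers(env_var: str = "COGNITIVE_ACTIONS_API_KEY") -> dict:
    """Build request headers from an API key stored in an environment variable.

    Assumption: the platform accepts a Bearer token in the Authorization header.
    """
    api_key = os.environ.get(env_var)
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
```

Reading the key from the environment keeps it out of source control; the resulting dictionary can be passed directly as the headers argument of an HTTP client.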

Cognitive Actions Overview

Generate Image with Inpainting

Description:
This action generates and inpaints images from a prompt, an aspect ratio, and other customizable settings. It supports image-to-image transformations and offers fast generation through optimized models such as 'schnell', which produce results in fewer inference steps.

Category: image-generation

Input

The input for this action is structured as follows:

{
  "prompt": "string (required)",
  "image": "string (optional, uri)",
  "mask": "string (optional, uri)",
  "model": "string (default: 'dev')",
  "width": "integer (optional)",
  "height": "integer (optional)",
  "goFast": "boolean (default: false)",
  "seed": "integer (optional)",
  "aspect_ratio": "string (default: '1:1')",
  "numberOfOutputs": "integer (default: 1)",
  "imageOutputFormat": "string (default: 'webp')",
  "imageOutputQuality": "integer (default: 80)",
  "numberOfInferenceSteps": "integer (default: 28)",
  "guidanceScale": "number (default: 3)",
  "promptStrength": "number (default: 0.8)",
  "loraScale": "number (default: 1)",
  "extraLoraScale": "number (default: 1)",
  "imageAspectRatio": "string (default: '1:1')",
  "isSafetyCheckerDisabled": "boolean (default: false)"
}
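
As an illustration of how the schema could be applied in client code, the sketch below fills in the documented defaults and rejects field names the schema does not list. The names build_payload, DEFAULTS, and OTHER_FIELDS are hypothetical helpers, and the validation is deliberately light; the platform itself remains the authority on what is accepted.

```python
from typing import Any

# Defaults copied from the schema above.
DEFAULTS: dict[str, Any] = {
    "model": "dev",
    "goFast": False,
    "numberOfOutputs": 1,
    "imageOutputFormat": "webp",
    "imageOutputQuality": 80,
    "numberOfInferenceSteps": 28,
    "guidanceScale": 3,
    "promptStrength": 0.8,
    "loraScale": 1,
    "extraLoraScale": 1,
    "imageAspectRatio": "1:1",
    "isSafetyCheckerDisabled": False,
}

# Fields the schema lists that this sketch accepts but does not send by default.
OTHER_FIELDS = {"image", "mask", "width", "height", "seed", "aspect_ratio"}


def build_payload(prompt: str, **overrides: Any) -> dict[str, Any]:
    """Return a payload with the documented defaults, then apply overrides."""
    if not isinstance(prompt, str) or not prompt.strip():
        raise ValueError("'prompt' is required and must be a non-empty string")
    unknown = set(overrides) - set(DEFAULTS) - OTHER_FIELDS
    if unknown:
        raise ValueError(f"Unknown input fields: {sorted(unknown)}")
    return {"prompt": prompt, **DEFAULTS, **overrides}


# Example: a fast-mode request. The step count of 4 is an assumed value,
# not one stated in the schema.
fast_payload = build_payload(
    "a bridge at dusk", model="schnell", numberOfInferenceSteps=4
)
```

Rejecting unknown keys on the client side catches typos such as a misspelled field name before a request is sent.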

Example Input:

{
  "image": "https://replicate.delivery/pbxt/LePWi703Fx1ohDpfngpSwKobes3Kf2C79rFY2gXBO55ovw3V/pont-de-Cocody.webp",
  "model": "dev",
  "prompt": "a big pink ribbon tied on a bridge with a photo of pont alasanne",
  "loraScale": 1,
  "guidanceScale": 3.5,
  "extraLoraScale": 1,
  "promptStrength": 0.8,
  "numberOfOutputs": 1,
  "imageAspectRatio": "1:1",
  "imageOutputFormat": "png",
  "imageOutputQuality": 90,
  "numberOfInferenceSteps": 28
}
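
The example input passes an image but no mask. As a rough sketch, and assuming the common convention that an image alone requests an image-to-image transformation while an image plus a mask requests inpainting (the source does not spell this out), the three request shapes could be told apart as follows. The function name generation_mode and the example.com URLs are hypothetical.

```python
from typing import Any


def generation_mode(payload: dict[str, Any]) -> str:
    """Classify a payload by which optional image inputs it carries."""
    has_image = bool(payload.get("image"))
    has_mask = bool(payload.get("mask"))
    if has_mask and not has_image:
        # Assumption: a mask only makes sense alongside a source image.
        raise ValueError("'mask' was given without 'image'")
    if has_image and has_mask:
        return "inpainting"
    if has_image:
        return "image-to-image"
    return "text-to-image"


# Hypothetical placeholder URLs, for illustration only.
SOURCE_URL = "https://example.com/bridge.webp"
MASK_URL = "https://example.com/bridge-mask.png"

text_only = {"prompt": "a big pink ribbon tied on a bridge"}
img2img = {**text_only, "image": SOURCE_URL, "promptStrength": 0.8}
inpaint = {**img2img, "mask": MASK_URL}
```

A check like this is useful mainly as a guard in client code, for example to catch a mask that was supplied without its source image.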

Output

The action returns a list of URLs pointing to the generated images, typically as an array of strings with one entry per image.

Example Output:

[
  "https://assets.cognitiveactions.com/invocations/6aefd9c9-ddd7-41b6-9a5e-bbabaeda7276/4641bee5-b341-4693-acb4-c48dcb7fd369.png"
]
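
Since the output is a list of URLs, a natural follow-up is saving the files locally. The sketch below uses only the Python standard library and assumes the URLs can be fetched without extra authentication, which the source does not confirm. The function names filename_from_url and save_images and the outputs directory are hypothetical.

```python
from pathlib import Path, PurePosixPath
from urllib.parse import urlparse
from urllib.request import urlopen


def filename_from_url(url: str) -> str:
    """Use the last path segment of the URL as the local file name."""
    name = PurePosixPath(urlparse(url).path).name
    if not name:
        raise ValueError(f"Cannot derive a file name from {url!r}")
    return name


def save_images(urls: list[str], out_dir: str = "outputs") -> list[Path]:
    """Download each URL into out_dir and return the saved paths."""
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    saved = []
    for url in urls:
        dest = target / filename_from_url(url)
        # Assumption: the asset URL is publicly readable.
        with urlopen(url, timeout=60) as resp:
            dest.write_bytes(resp.read())
        saved.append(dest)
    return saved
```

Calling save_images with the array from the example output would write one file per URL, named after the final path segment.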

Conceptual Usage Example (Python)

Here’s an illustrative Python snippet demonstrating how to invoke the "Generate Image with Inpainting" action via a hypothetical Cognitive Actions execution endpoint:

import requests
import json

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute"  # Hypothetical endpoint

action_id = "1d197a2d-1989-4f9b-b96c-d37fc194d5ee"  # Action ID for Generate Image with Inpainting

# Construct the input payload based on the action's requirements
payload = {
    "image": "https://replicate.delivery/pbxt/LePWi703Fx1ohDpfngpSwKobes3Kf2C79rFY2gXBO55ovw3V/pont-de-Cocody.webp",
    "model": "dev",
    "prompt": "a big pink ribbon tied on a bridge with a photo of pont alasanne",
    "loraScale": 1,
    "guidanceScale": 3.5,
    "extraLoraScale": 1,
    "promptStrength": 0.8,
    "numberOfOutputs": 1,
    "imageAspectRatio": "1:1",
    "imageOutputFormat": "png",
    "imageOutputQuality": 90,
    "numberOfInferenceSteps": 28
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json"
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload},  # Hypothetical structure
        timeout=120,  # Avoid hanging indefinitely; tune to how long generation takes
    )
    response.raise_for_status()  # Raise an exception for bad status codes (4xx or 5xx)

    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))

except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError:  # Body is not valid JSON; all JSON decode errors subclass ValueError
            print(f"Response body: {e.response.text}")

In this code snippet:

  • Replace YOUR_COGNITIVE_ACTIONS_API_KEY with your actual API key.
  • The payload variable is structured to match the input requirements for the "Generate Image with Inpainting" action.
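  • The response is parsed as JSON and printed. Given the output format described above, it should contain the URLs of the generated images, though the exact response shape depends on the actual endpoint.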

Conclusion

The daoud23/pont Cognitive Actions let you generate and inpaint images from a single JSON payload, with control over the model, aspect ratio, output format, and quality. By integrating these actions, developers can produce visual content for their applications without building an image pipeline from scratch. Consider experimenting with the settings covered above to find the combinations that suit your projects. Happy coding!