Enhance Your Applications with Image Generation Using Geethkalhara's Cognitive Actions

22 Apr 2025

Creating customized images on demand is a useful capability for developers. The Geethkalhara/model2 spec offers Cognitive Actions that generate images from text prompts. This article walks through the Generate Custom Image action, covering its inputs, its output, and how to call it from your application.

Prerequisites

Before you start integrating the Cognitive Actions, ensure you have the following:

  • An API key for the Cognitive Actions platform to authenticate your requests.
  • Familiarity with making HTTP requests in your programming language of choice.

Authentication typically involves sending the API key in a request header. The conceptual example later in this article passes it as a Bearer token in the Authorization header.
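
The header handling described above can be sketched as follows. This is a minimal sketch, assuming the Bearer scheme used in this article's conceptual example; the build_headers helper and the COGNITIVE_ACTIONS_API_KEY environment variable name are hypothetical and chosen only for illustration.

```python
import os
from typing import Dict, Optional


def build_headers(api_key: Optional[str] = None) -> Dict[str, str]:
    """Return request headers carrying the API key as a Bearer token.

    Assumption: the platform accepts "Authorization: Bearer <key>".
    The environment variable name below is hypothetical.
    """
    key = api_key or os.environ.get("COGNITIVE_ACTIONS_API_KEY")
    if not key:
        raise ValueError("No API key supplied and COGNITIVE_ACTIONS_API_KEY is not set")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Reading the key from the environment keeps it out of source control, which is usually preferable to hard-coding it in the script.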

Cognitive Actions Overview

Generate Custom Image

The Generate Custom Image action creates a customized image from a textual prompt. It supports adjustments such as output quality and aspect ratio, and it accepts an input image for inpainting or image-to-image conversion. It's categorized under image-generation.

Input

The input for this action is structured as an object and includes several required and optional fields. Below is the schema outline:

  • prompt (required): A detailed description that guides the image generation.
  • image (optional): A URI of an input image for inpainting or image-to-image conversion.
  • aspectRatio (optional): Defines the aspect ratio of the generated image.
  • width and height (optional): Specify the dimensions of the image if aspectRatio is set to custom.
  • guidanceScale (optional): A numeric scale influencing the generated image's adherence to the prompt.
  • outputQuality (optional): The quality level of the final output image (0 to 100).
  • numberOfOutputs (optional): The number of images to generate (1-4).
  • Additional optional parameters include seed, enableFastMode, inferenceModel, imageOutputFormat, and numberOfInferenceSteps, among others, for fine-tuning the image generation process.
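
The constraints listed above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the platform: the build_payload helper is hypothetical, it covers only the fields described in this list, and it assumes the literal value "custom" is what enables width and height.

```python
from typing import Any, Dict, Optional


def build_payload(
    prompt: str,
    *,
    image: Optional[str] = None,
    aspect_ratio: Optional[str] = None,
    width: Optional[int] = None,
    height: Optional[int] = None,
    guidance_scale: Optional[float] = None,
    output_quality: Optional[int] = None,
    number_of_outputs: Optional[int] = None,
) -> Dict[str, Any]:
    """Build an input object, enforcing the ranges documented in this article."""
    if not prompt or not prompt.strip():
        raise ValueError("prompt is required")
    if output_quality is not None and not 0 <= output_quality <= 100:
        raise ValueError("outputQuality must be between 0 and 100")
    if number_of_outputs is not None and not 1 <= number_of_outputs <= 4:
        raise ValueError("numberOfOutputs must be between 1 and 4")
    # Assumption: width/height are only meaningful when aspectRatio is "custom".
    if (width is not None or height is not None) and aspect_ratio != "custom":
        raise ValueError("width and height apply only when aspectRatio is 'custom'")

    candidates = {
        "prompt": prompt,
        "image": image,
        "aspectRatio": aspect_ratio,
        "width": width,
        "height": height,
        "guidanceScale": guidance_scale,
        "outputQuality": output_quality,
        "numberOfOutputs": number_of_outputs,
    }
    # Omit unset fields so the service can apply its own defaults.
    return {name: value for name, value in candidates.items() if value is not None}
```

For example, build_payload("A cozy cafe", aspect_ratio="1:1", output_quality=90) yields an object containing only prompt, aspectRatio, and outputQuality.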

Here’s a practical example of the JSON payload needed to invoke this action:

{
  "image": "https://replicate.delivery/pbxt/LyohAqfsB3Y2U1eFWmnWMQf1MqRl6xl1IiDT2MqAIx28CuNf/replicate-prediction-e3a433jy3xrme0ck6asv9ynyam.png",
  "prompt": "A close up realistic image of Geeth sitting in a cozy cafe near a window, gracefully holding a white coffee cup in one hand.",
  "aspectRatio": "1:1",
  "guidanceScale": 3.5,
  "outputQuality": 90,
  "numberOfOutputs": 1,
  "imageOutputFormat": "png",
  "numberOfInferenceSteps": 28
}

Output

The output is a JSON array of URLs, one for each generated image. With numberOfOutputs set to 1, the output could look like this:

[
  "https://assets.cognitiveactions.com/invocations/817b47ee-7785-44c0-8e16-7a3da5e64bff/38aa7531-8e0a-49b8-908c-73eedf6956bd.png"
]
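
Since the output is a list of URLs, a client will usually want to save each image locally. The following is a minimal sketch using only the Python standard library; the output_filenames and download_all helpers are hypothetical, and the sketch assumes the URLs can be fetched without extra authentication, which the source does not confirm.

```python
import os
import posixpath
from typing import List
from urllib.parse import urlparse
from urllib.request import urlopen


def output_filenames(result: List[str]) -> List[str]:
    """Derive a local file name from the last path segment of each URL."""
    names = []
    for url in result:
        name = posixpath.basename(urlparse(url).path)
        if not name:
            raise ValueError(f"Cannot derive a file name from {url!r}")
        names.append(name)
    return names


def download_all(result: List[str], dest_dir: str = ".") -> List[str]:
    """Download every image in the result list and return the saved paths."""
    saved = []
    for url, name in zip(result, output_filenames(result)):
        target = os.path.join(dest_dir, name)
        # Assumption: the asset URL is publicly readable.
        with urlopen(url, timeout=30) as response, open(target, "wb") as handle:
            handle.write(response.read())
        saved.append(target)
    return saved
```

Iterating over the list rather than reading only the first element keeps the same code working when numberOfOutputs is greater than 1.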

Conceptual Usage Example (Python)

Here’s how you might call the Generate Custom Image action using a hypothetical Cognitive Actions execution endpoint:

import requests
import json

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute" # Hypothetical endpoint

action_id = "ab185d0f-4630-4623-b24d-d8abb43f0d6d"  # Action ID for Generate Custom Image

# Construct the input payload based on the action's requirements
payload = {
    "image": "https://replicate.delivery/pbxt/LyohAqfsB3Y2U1eFWmnWMQf1MqRl6xl1IiDT2MqAIx28CuNf/replicate-prediction-e3a433jy3xrme0ck6asv9ynyam.png",
    "prompt": "A close up realistic image of Geeth sitting in a cozy cafe near a window, gracefully holding a white coffee cup in one hand.",
    "aspectRatio": "1:1",
    "guidanceScale": 3.5,
    "outputQuality": 90,
    "numberOfOutputs": 1,
    "imageOutputFormat": "png",
    "numberOfInferenceSteps": 28
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json"
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload},  # Hypothetical structure
        timeout=60,  # Fail instead of waiting forever; adjust to your needs
    )
    response.raise_for_status()  # Raise an exception for bad status codes (4xx or 5xx)

    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))

except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError:  # Body is not valid JSON; fall back to raw text
            print(f"Response body: {e.response.text}")

In this code snippet, replace the API key placeholder with your own key. The endpoint URL and the request body structure (action_id and inputs) are hypothetical, so adjust them to match the platform's actual API. The payload object matches the input schema of the Generate Custom Image action described above.

Conclusion

The Generate Custom Image action from the Geethkalhara/model2 spec provides developers with an easy way to create custom images from textual prompts. By leveraging its diverse input parameters, you can tailor images to meet specific aesthetic and functional requirements. As you integrate these capabilities into your applications, consider exploring different prompts and configurations to fully capitalize on the potential of image generation. Happy coding!