Enhance Your Applications with Image Generation Using Cognitive Actions

22 Apr 2025

Image generation lets developers produce visual content programmatically. The sambowenhughes/testing-sdx-demo-v2 API exposes a Cognitive Action, Generate Enhanced Images, that generates images from a text prompt and can optionally start from an input image (img2img) or repaint masked regions of one (inpainting). This article covers the action's inputs and outputs and shows how to integrate it into your applications.

Prerequisites

Before you start using the Cognitive Actions, ensure that you have:

  • An API key for the Cognitive Actions platform.
  • Familiarity with making HTTP requests and handling JSON data.

Authentication is typically done by passing the API key in the headers of your requests, allowing secure access to the Cognitive Actions.
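
As a minimal sketch of header-based authentication, the helper below builds the request headers from an API key. It assumes the bearer-token scheme used in the conceptual example later in this article; the environment variable name and the `build_headers` helper are hypothetical and not part of the platform.

```python
import os


def build_headers(api_key: str) -> dict:
    """Return HTTP headers carrying the API key as a bearer token."""
    if not api_key:
        raise ValueError("API key is empty")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }


# Reading the key from an environment variable keeps it out of source control.
# The variable name is an assumption made for this sketch.
api_key = os.environ.get("COGNITIVE_ACTIONS_API_KEY") or "YOUR_COGNITIVE_ACTIONS_API_KEY"
headers = build_headers(api_key)
```

Keeping header construction in one place makes it easier to change the scheme if your account uses a different one.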

Cognitive Actions Overview

Generate Enhanced Images

The Generate Enhanced Images action generates images from a text prompt and, optionally, an input image and mask. Settings such as the refinement mode, scheduler type, and guidance scale control how the image is produced.

Input

The action takes a JSON object with the following fields:

  • mask (string, optional): URI pointing to an input mask for inpaint mode. Black areas will be preserved, while white areas will be inpainted.
  • seed (integer, optional): Integer seed for random number generation. Omit this field to use a random seed.
  • image (string, optional): URI pointing to an input image for img2img or inpaint mode.
  • width (integer, optional, default: 1024): Width of the output image in pixels.
  • height (integer, optional, default: 1024): Height of the output image in pixels.
  • prompt (string, required): Text prompt that guides image generation.
  • refine (string, optional, default: "no_refiner"): Select the refinement style to apply.
  • loraScale (number, optional, default: 0.6): LoRA additive scale.
  • scheduler (string, optional, default: "K_EULER"): Type of scheduler during image generation.
  • guidanceScale (number, optional, default: 7.5): Factor for classifier-free guidance.
  • applyWatermark (boolean, optional, default: true): Apply a watermark to the generated image.
  • negativePrompt (string, optional): Text prompt indicating undesired features.
  • promptStrength (number, optional, default: 0.8): Strength of the prompt in img2img/inpaint modes.
  • numberOfOutputs (integer, optional, default: 1): Number of images to generate, from 1 to 4.
  • refinementSteps (integer, optional): Number of refinement steps for the base image refiner.
  • highNoiseFraction (number, optional, default: 0.8): Fraction of noise for the expert ensemble refiner.
  • disableSafetyChecker (boolean, optional, default: false): Flag to disable the safety checker.
  • numberOfInferenceSteps (integer, optional, default: 50): Number of denoising steps during image generation.
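
The field list above can be turned into a small client-side builder that fills in the documented defaults and rejects obviously invalid input before a request is sent. This is a sketch only: `build_payload` and the two constants are hypothetical helpers, and the only checks applied are the ones the list states (a required prompt and 1 to 4 outputs). The service may enforce further rules.

```python
# Defaults copied from the field list above.
DEFAULTS = {
    "width": 1024,
    "height": 1024,
    "refine": "no_refiner",
    "loraScale": 0.6,
    "scheduler": "K_EULER",
    "guidanceScale": 7.5,
    "applyWatermark": True,
    "promptStrength": 0.8,
    "numberOfOutputs": 1,
    "highNoiseFraction": 0.8,
    "disableSafetyChecker": False,
    "numberOfInferenceSteps": 50,
}

# Fields with no documented default; they are sent only when supplied.
OPTIONAL_NO_DEFAULT = {"mask", "seed", "image", "negativePrompt", "refinementSteps"}


def build_payload(prompt: str, **overrides) -> dict:
    """Merge overrides into the documented defaults and validate the result."""
    if not isinstance(prompt, str) or not prompt.strip():
        raise ValueError("prompt is required and must be a non-empty string")
    unknown = set(overrides) - set(DEFAULTS) - OPTIONAL_NO_DEFAULT
    if unknown:
        raise ValueError(f"unknown field(s): {sorted(unknown)}")
    payload = {**DEFAULTS, "prompt": prompt}
    # Skip None values so that fields such as seed are omitted entirely.
    payload.update({k: v for k, v in overrides.items() if v is not None})
    n = payload["numberOfOutputs"]
    if isinstance(n, bool) or not isinstance(n, int) or not 1 <= n <= 4:
        raise ValueError("numberOfOutputs must be an integer from 1 to 4")
    return payload


payload = build_payload("a red bicycle", numberOfOutputs=2, seed=None)
```

Rejecting unknown field names locally catches typos such as `guidance_scale` before they reach the API.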

Example Input:

{
  "width": 1024,
  "height": 1024,
  "prompt": "Create me Kanye west styled lego sets",
  "refine": "no_refiner",
  "loraScale": 0.6,
  "scheduler": "K_EULER",
  "guidanceScale": 7.5,
  "applyWatermark": true,
  "negativePrompt": "",
  "promptStrength": 0.8,
  "numberOfOutputs": 4,
  "highNoiseFraction": 0.8,
  "numberOfInferenceSteps": 50
}
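
The example input uses only a prompt. Based on the field descriptions, supplying image alone would select img2img and supplying both image and mask would select inpainting. The sketch below makes that reading explicit. It assumes the service infers the mode from which fields are present; the mode names are labels for this sketch rather than API values, and the example.com URIs are placeholders.

```python
def infer_mode(payload: dict) -> str:
    """Guess which generation mode a payload would trigger."""
    has_image = bool(payload.get("image"))
    has_mask = bool(payload.get("mask"))
    if has_mask and not has_image:
        # A mask marks regions of an input image, so it needs one to apply to.
        raise ValueError("mask was given without image; inpainting needs both")
    if has_mask:
        return "inpaint"
    if has_image:
        return "img2img"
    return "txt2img"


text_only = {"prompt": "a lighthouse at dusk"}
img2img = {**text_only, "image": "https://example.com/input.png", "promptStrength": 0.8}
inpaint = {**img2img, "mask": "https://example.com/mask.png"}

for name, request_body in [("text_only", text_only), ("img2img", img2img), ("inpaint", inpaint)]:
    print(name, "->", infer_mode(request_body))
```

In the inpainting case, remember that black areas of the mask are preserved and white areas are repainted.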

Output

The action returns a JSON array of URLs, one for each generated image.

Example Output:

[
  "https://assets.cognitiveactions.com/invocations/986acef3-e648-4737-9160-7212f49a07c7/212a5406-75f5-4b07-b182-9555707e8e21.png",
  "https://assets.cognitiveactions.com/invocations/986acef3-e648-4737-9160-7212f49a07c7/897422ab-41a1-414e-abea-2c306373ebfe.png",
  "https://assets.cognitiveactions.com/invocations/986acef3-e648-4737-9160-7212f49a07c7/14084f65-090f-4519-ae77-dfec8da202bf.png",
  "https://assets.cognitiveactions.com/invocations/986acef3-e648-4737-9160-7212f49a07c7/87f3ebe7-91b8-4379-a60a-3b8b8be4141e.png"
]
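
Because the output is a list of URLs, saving the images locally takes one more step. The sketch below uses only the Python standard library and names each file after the last path segment of its URL. It assumes the URLs can be fetched without extra authentication, which the source does not confirm; `filename_from_url` and `download_images` are hypothetical helpers.

```python
import posixpath
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import urlopen


def filename_from_url(url: str) -> str:
    """Return the last path segment of the URL, for example '<uuid>.png'."""
    name = posixpath.basename(urlparse(url).path)
    if not name:
        raise ValueError(f"no file name in URL: {url}")
    return name


def download_images(urls, dest_dir="outputs", timeout=30):
    """Save each URL under dest_dir and return the written paths."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    saved = []
    for url in urls:
        target = dest / filename_from_url(url)
        with urlopen(url, timeout=timeout) as resp:
            target.write_bytes(resp.read())
        saved.append(target)
    return saved


# Usage, given the list returned by the action:
#   paths = download_images(result_urls)
```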

Conceptual Usage Example (Python)

Here's how you might call the Generate Enhanced Images action using a hypothetical endpoint:

import requests
import json

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute" # Hypothetical endpoint

action_id = "1f4b8ae2-c4f0-40d9-939a-4f02fefbf75b" # Action ID for Generate Enhanced Images

# Construct the input payload based on the action's requirements
payload = {
    "width": 1024,
    "height": 1024,
    "prompt": "Create me Kanye west styled lego sets",
    "refine": "no_refiner",
    "loraScale": 0.6,
    "scheduler": "K_EULER",
    "guidanceScale": 7.5,
    "applyWatermark": True,
    "negativePrompt": "",
    "promptStrength": 0.8,
    "numberOfOutputs": 4,
    "highNoiseFraction": 0.8,
    "numberOfInferenceSteps": 50
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json"
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload},
        timeout=120  # Illustrative value; without a timeout the call can hang indefinitely
    )
    response.raise_for_status()  # Raise an exception for bad status codes (4xx or 5xx)

    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))

except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError:  # Raised by .json() when the body is not valid JSON
            print(f"Response body: {e.response.text}")

In this Python snippet, replace YOUR_COGNITIVE_ACTIONS_API_KEY with your actual API key. The action ID corresponds to the Generate Enhanced Images action, and the payload is structured according to the input schema defined earlier.

Conclusion

The Generate Enhanced Images Cognitive Action generates custom images from a text prompt, with optional img2img and inpainting inputs and settings for refinement, scheduling, and guidance. Whether you are modifying existing graphics or creating new visual content, you can tailor these inputs to your application's needs and integrate the action with a single authenticated request.