Create Stunning Visuals with swk23/thrawn's Image Generation Actions

21 Apr 2025

Generating images from text prompts is a practical tool for developers, not just a curiosity. The swk23/thrawn Cognitive Actions expose an API for generating images from user-defined prompts, with options covering inpainting, output size and format, and inference settings. This post shows how to integrate the Generate Images with Prompt action into your applications.

Prerequisites

Before diving into the integration, ensure you have the following:

  • An API key for the Cognitive Actions platform.
  • Basic knowledge of making API calls and handling JSON data.
  • Python installed on your machine to run the example code.

Authentication typically involves passing your API key in the request headers, allowing secure access to the Cognitive Actions services.
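
As a minimal sketch of that header-based authentication, the helper below reads the key from an environment variable instead of hardcoding it in source. The environment variable name COGNITIVE_ACTIONS_API_KEY, the helper name build_headers, and the Bearer scheme are assumptions for illustration; confirm the exact header format against the platform's documentation.

```python
import os


def build_headers(api_key=None):
    """Build request headers, falling back to an environment variable for the key.

    Hypothetical helper: the Bearer scheme and the variable name are assumptions.
    """
    key = api_key or os.environ.get("COGNITIVE_ACTIONS_API_KEY")
    if not key:
        raise RuntimeError("No API key given and COGNITIVE_ACTIONS_API_KEY is not set")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Keeping the key out of the source file makes it harder to commit it to version control by accident.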

Cognitive Actions Overview

Generate Images with Prompt

Description: This action generates images from text prompts. It supports inpainting and image-to-image input, and lets you control the aspect ratio, the inference model, and the output format and quality.

  • Category: image-generation

Input

The input for this action adheres to the following schema:

  • prompt (required): Descriptive text to influence the image (e.g., "close up of Grand Admiral Thrawn stands tall...").
  • mask (optional): URI for the image mask used in inpainting mode.
  • image (optional): URI for an input image for image-to-image transformation.
  • width (optional): Custom width of the generated image (256-1440).
  • height (optional): Custom height of the generated image (256-1440).
  • megapixels (optional): Approximate number of megapixels for the output image.
  • aspectRatio (optional): Aspect ratio for the generated image (e.g., 21:9).
  • outputCount (optional): Number of images to generate (1-4).
  • outputFormat (optional): Image format (e.g., jpg, png).
  • guidanceScale (optional): Scale for the diffusion process (0-10).
  • outputQuality (optional): Quality of the saved outputs (0-100).
  • denoisingSteps (optional): Number of denoising steps (1-50).
  • inferenceModel (optional): Model used for inference (e.g., dev).
  • optimizeForSpeed (optional): Speed optimization settings.
  • primaryLoraScale (optional): Adjusts the influence of main LoRA weights.
  • additionalLoraScale (optional): Adjusts the influence of additional LoRA weights.
  • imageInfluenceStrength (optional): Strength of image influence in transformations.
  • deactivateSafetyChecker (optional): Toggles the safety checker.
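
The numeric ranges listed above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the platform's API: the names RANGES and validate_inputs are hypothetical, and it assumes the documented bounds are inclusive.

```python
# Numeric ranges taken from the input schema above (assumed to be inclusive).
RANGES = {
    "width": (256, 1440),
    "height": (256, 1440),
    "outputCount": (1, 4),
    "guidanceScale": (0, 10),
    "outputQuality": (0, 100),
    "denoisingSteps": (1, 50),
}


def validate_inputs(inputs):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    prompt = inputs.get("prompt")
    if not isinstance(prompt, str) or not prompt.strip():
        problems.append("prompt is required and must be a non-empty string")
    for name, (low, high) in RANGES.items():
        if name not in inputs:
            continue  # Optional field that was not supplied
        value = inputs[name]
        # bool is a subclass of int in Python, so reject it explicitly
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            problems.append(f"{name} must be a number")
        elif not low <= value <= high:
            problems.append(f"{name} must be between {low} and {high}, got {value}")
    return problems
```

Calling validate_inputs on a payload before sending it catches out-of-range values locally; the service remains the authority on what it actually accepts.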

Example Input:

{
  "mask": "https://replicate.delivery/pbxt/MSsWwCc2J9C9y1lJagB7emAf7SUZhq1sm0gox9Bmu87TGfRt/test1.png",
  "image": "https://replicate.delivery/pbxt/MSsWw1V5lhrBELDWGjE3yj5FMxXWynrP5wluUxurtBu3kpOw/test.png",
  "prompt": "\"close up of Grand Admiral Thrawn stands tall in a cold, windowless, metallic Imperial meeting room, His pristine white uniform contrasts against the dark surroundings, his posture exuding quiet confidence and meticulous calculation.\"",
  "megapixels": "1",
  "aspectRatio": "21:9",
  "outputCount": 1,
  "outputFormat": "jpg",
  "guidanceScale": 3,
  "outputQuality": 80,
  "denoisingSteps": 28,
  "inferenceModel": "dev",
  "optimizeForSpeed": false,
  "primaryLoraScale": 1,
  "additionalLoraScale": 1,
  "imageInfluenceStrength": 0.8
}

Output

The action typically returns an array of image URLs generated based on the input prompt.

Example Output:

[
  "https://assets.cognitiveactions.com/invocations/e6c546e1-f074-489a-bfad-ab7ff3f53fa3/91bdbf6c-300a-4eee-9387-ad9acb4fd3c6.jpg"
]
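
Since the output is a list of URLs, a typical next step is to save the images locally. The sketch below uses only the Python standard library and assumes the returned URLs can be fetched without extra authentication, which the source does not confirm. The function names filename_from_url and save_images are hypothetical.

```python
import os
import posixpath
from urllib.parse import urlparse
from urllib.request import urlopen


def filename_from_url(url):
    """Use the last path segment of the URL as the local file name."""
    return posixpath.basename(urlparse(url).path)


def save_images(urls, directory="."):
    """Download each URL into `directory` and return the saved file paths."""
    saved = []
    for url in urls:
        path = os.path.join(directory, filename_from_url(url))
        # Assumption: the URL is publicly readable; add headers if it is not
        with urlopen(url, timeout=60) as response, open(path, "wb") as out:
            out.write(response.read())
        saved.append(path)
    return saved
```

With the example output above, filename_from_url yields the final .jpg segment of the URL, so each generated image keeps its server-side name.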

Conceptual Usage Example (Python)

Here’s how you might implement a call to the Generate Images with Prompt action in Python:

import requests
import json

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute" # Hypothetical endpoint

action_id = "9c6510c2-dab7-41bf-8fb0-4ca77bc5b731" # Action ID for Generate Images with Prompt

# Construct the input payload based on the action's requirements
payload = {
    "mask": "https://replicate.delivery/pbxt/MSsWwCc2J9C9y1lJagB7emAf7SUZhq1sm0gox9Bmu87TGfRt/test1.png",
    "image": "https://replicate.delivery/pbxt/MSsWw1V5lhrBELDWGjE3yj5FMxXWynrP5wluUxurtBu3kpOw/test.png",
    "prompt": "\"close up of Grand Admiral Thrawn stands tall in a cold, windowless, metallic Imperial meeting room, His pristine white uniform contrasts against the dark surroundings, his posture exuding quiet confidence and meticulous calculation.\"",
    "megapixels": "1",
    "aspectRatio": "21:9",
    "outputCount": 1,
    "outputFormat": "jpg",
    "guidanceScale": 3,
    "outputQuality": 80,
    "denoisingSteps": 28,
    "inferenceModel": "dev",
    "optimizeForSpeed": False,
    "primaryLoraScale": 1,
    "additionalLoraScale": 1,
    "imageInfluenceStrength": 0.8
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json"
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload}, # Hypothetical structure
        timeout=60 # Fail instead of hanging indefinitely; tune for long generations
    )
    response.raise_for_status() # Raise an exception for bad status codes (4xx or 5xx)

    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))

except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError: # Covers every JSON decode error variant that requests can raise
            print(f"Response body: {e.response.text}")

In this example, replace the API key placeholder and the hypothetical endpoint with your own values. The payload mirrors the example input above. The request body that wraps it (action_id and inputs) is a hypothetical structure, so check it against the platform's documentation before relying on it.

Conclusion

The Generate Images with Prompt action from the swk23/thrawn Cognitive Actions gives developers programmatic access to AI-driven image generation. Its customization options cover inpainting, image-to-image input, output size and format, and inference settings, which makes it suitable for creating art, enhancing user experiences, or producing unique content. As a next step, experiment with different prompts and settings to see how each parameter affects the result.