Transforming Images with the asronline/btas-refined Cognitive Actions

23 Apr 2025

The asronline/btas-refined API exposes Cognitive Actions for image generation. With them, developers can generate refined images from text prompts, perform inpainting, and apply different refinement styles. Because the actions are pre-built, they reduce the development effort needed to add image generation to an application.

Prerequisites

To get started, you need an API key for the Cognitive Actions platform. The key authenticates your requests: include it in the request headers of every API call.
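
One way to keep the key out of source code is to read it from an environment variable and build the headers once. This is a minimal sketch: the variable name COGNITIVE_ACTIONS_API_KEY and the helper build_headers are chosen here for illustration, and the Bearer scheme is assumed from the Python example later in this guide.

```python
import os


def build_headers(api_key: str) -> dict:
    """Return request headers carrying the API key as a Bearer token."""
    if not api_key:
        raise ValueError("API key is empty")
    return {
        "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        "Content-Type": "application/json",
    }


# Hypothetical variable name; use whatever your deployment defines.
api_key = os.environ.get("COGNITIVE_ACTIONS_API_KEY", "")
headers = build_headers(api_key) if api_key else None
```

Failing early on an empty key gives a clearer error than an authentication failure returned by the server.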

Cognitive Actions Overview

Generate Refined Images

The Generate Refined Images action creates high-quality images from a text prompt, which can be combined with an input image and mask. It supports inpainting and several refinement styles.

Input

The action takes a JSON object as input. Here is an example:

{
  "width": 1024,
  "height": 1024,
  "prompt": "In the style of BTAS, a shadowy smile of a clown in a dark and mysterious hazy background",
  "refine": "no_refiner",
  "loraScale": 0.6,
  "scheduler": "K_EULER",
  "guidanceScale": 7.5,
  "applyWatermark": true,
  "negativePrompt": "",
  "promptStrength": 0.8,
  "numberOfOutputs": 1,
  "highNoiseFraction": 0.8,
  "numberOfInferenceSteps": 50
}

Input Fields:

  • width: (integer) The width of the output image in pixels. Default is 1024.
  • height: (integer) The height of the output image in pixels. Default is 1024.
  • prompt: (string) Describes the desired output; outlines the subject and surrounding context.
  • refine: (string) Selects the refinement style. Options include no_refiner, expert_ensemble_refiner, and base_image_refiner.
  • loraScale: (number) Scale applied to the LoRA (Low-Rank Adaptation) weights.
  • scheduler: (string) The scheduling algorithm used during generation, for example K_EULER.
  • guidanceScale: (number) Scale for classifier-free guidance; higher values make the output follow the prompt more closely.
  • applyWatermark: (boolean) Indicates if a watermark is applied to the generated image.
  • negativePrompt: (string) Describes what should be excluded from the generated image.
  • promptStrength: (number) Indicates the strength of the prompt during img2img or inpaint operations.
  • numberOfOutputs: (integer) Specifies how many images will be generated (maximum of 4).
  • highNoiseFraction: (number) The fraction of noise to use when refine is set to expert_ensemble_refiner.
  • numberOfInferenceSteps: (integer) Specifies the number of denoising steps during image generation.
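
The constraints stated in the field list can be checked on the client before a request is sent. The sketch below is not part of the API; validate_payload is a hypothetical helper. It assumes that prompt is required, that refine falls back to no_refiner, and that numberOfOutputs must be at least 1, none of which the field list states explicitly. Only the 1024 defaults, the three refine options, and the maximum of 4 outputs come from the list above.

```python
ALLOWED_REFINE = {"no_refiner", "expert_ensemble_refiner", "base_image_refiner"}


def validate_payload(payload: dict) -> list:
    """Return a list of problems found in the payload (empty if none)."""
    problems = []
    for key in ("width", "height"):
        value = payload.get(key, 1024)  # documented default is 1024
        if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
            problems.append(f"{key} must be a positive integer")
    # Assumption: a non-empty prompt is required.
    if not isinstance(payload.get("prompt"), str) or not payload["prompt"].strip():
        problems.append("prompt must be a non-empty string")
    # Assumption: omitting refine is equivalent to no_refiner.
    if payload.get("refine", "no_refiner") not in ALLOWED_REFINE:
        problems.append("refine must be one of " + ", ".join(sorted(ALLOWED_REFINE)))
    outputs = payload.get("numberOfOutputs", 1)
    if isinstance(outputs, bool) or not isinstance(outputs, int) or not 1 <= outputs <= 4:
        problems.append("numberOfOutputs must be an integer from 1 to 4")
    return problems
```

Returning every problem at once, rather than raising on the first, lets a caller report all issues in a single pass.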

Output

Upon execution, the action returns an array of URLs pointing to the generated images. Here’s an example of the output format:

[
  "https://assets.cognitiveactions.com/invocations/bdbc6a5b-2198-4a88-bcb5-faef0e7e2058/43aa54dc-e0bb-4858-a537-7cd2d063be6c.png"
]
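
Since the output is a list of URLs, a typical next step is to save the images locally. The following sketch uses only the standard library and assumes the URLs can be fetched without extra authentication, which this guide does not confirm. The helper names filename_from_url and download_images, and the outputs directory, are illustrative.

```python
import os
import posixpath
import shutil
import urllib.request
from urllib.parse import urlparse


def filename_from_url(url: str) -> str:
    """Use the last path segment of the URL as the local file name."""
    name = posixpath.basename(urlparse(url).path)
    if not name:
        raise ValueError(f"no file name in URL: {url}")
    return name


def download_images(urls, out_dir="outputs"):
    """Download each URL into out_dir and return the saved paths."""
    os.makedirs(out_dir, exist_ok=True)
    saved = []
    for url in urls:
        path = os.path.join(out_dir, filename_from_url(url))
        # urlopen raises urllib.error.HTTPError on 4xx/5xx responses.
        with urllib.request.urlopen(url, timeout=60) as response, open(path, "wb") as fh:
            shutil.copyfileobj(response, fh)
        saved.append(path)
    return saved
```

Calling download_images(result) on the array returned by the action would write one file per generated image.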

Conceptual Usage Example (Python)

Here's how you can call the Generate Refined Images action using Python. This example builds the input JSON payload and makes the API call. The endpoint URL and the request body structure are hypothetical, so replace them with the values for your account.

import requests
import json

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute" # Hypothetical endpoint

action_id = "c9cfd9b7-fc06-4e4f-b22d-052e487faff1" # Action ID for Generate Refined Images

# Construct the input payload based on the action's requirements
payload = {
    "width": 1024,
    "height": 1024,
    "prompt": "In the style of BTAS, a shadowy smile of a clown in a dark and mysterious hazy background",
    "refine": "no_refiner",
    "loraScale": 0.6,
    "scheduler": "K_EULER",
    "guidanceScale": 7.5,
    "applyWatermark": True,
    "negativePrompt": "",
    "promptStrength": 0.8,
    "numberOfOutputs": 1,
    "highNoiseFraction": 0.8,
    "numberOfInferenceSteps": 50
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json"
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload}, # Hypothetical structure
        timeout=120 # Fail instead of hanging if the server never responds
    )
    response.raise_for_status() # Raise an exception for bad status codes (4xx or 5xx)

    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))

except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    # e.response is None for connection errors and timeouts
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError: # Body is not valid JSON; fall back to raw text
            print(f"Response body: {e.response.text}")

In this snippet:

  • The action_id is set to the ID of the Generate Refined Images action.
  • The payload dictionary is populated with the input fields described earlier.
  • The response is handled to check for success or errors, and the output is printed in a readable format.

Conclusion

The asronline/btas-refined Cognitive Actions give developers pre-built tools for image generation and refinement. By integrating them into your applications, you can automate content generation or experiment with creative image styles. The Generate Refined Images action and the example above are a starting point for that integration.