Transforming Images with the asronline/btas-refined Cognitive Actions

The asronline/btas-refined API provides Cognitive Actions for image generation. It lets developers generate refined images from text prompts, perform inpainting, and apply different refinement styles. Because the actions are pre-built, you can add image generation to an application without building and hosting the model pipeline yourself.
Prerequisites
To get started with the Cognitive Actions, you will need an API key for the Cognitive Actions platform. This key will be used for authentication by including it in the request headers when making API calls. Ensure you have your API key handy as it will be essential for executing the actions described in this guide.
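As a minimal sketch of this step, the snippet below reads the key from an environment variable and builds the request headers. The variable name COGNITIVE_ACTIONS_API_KEY and the helper build_headers are assumptions made for illustration; the Bearer scheme mirrors the usage example later in this guide.

```python
import os


def build_headers(env_var: str = "COGNITIVE_ACTIONS_API_KEY") -> dict:
    """Build request headers from an API key stored in the environment.

    The environment variable name is an assumption for this sketch.
    """
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable first.")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
```

Reading the key from the environment keeps it out of source control, which is usually preferable to hard-coding it in a script.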
Cognitive Actions Overview
Generate Refined Images
The Generate Refined Images action is designed for creating high-quality images based on specified textual prompts, alongside input images and masks. This action supports various options for customization, including inpainting and different refinement styles.
Input
The input for this action follows a specific schema. Below is the necessary structure along with an example input:
{
  "width": 1024,
  "height": 1024,
  "prompt": "In the style of BTAS, a shadowy smile of a clown in a dark and mysterious hazy background",
  "refine": "no_refiner",
  "loraScale": 0.6,
  "scheduler": "K_EULER",
  "guidanceScale": 7.5,
  "applyWatermark": true,
  "negativePrompt": "",
  "promptStrength": 0.8,
  "numberOfOutputs": 1,
  "highNoiseFraction": 0.8,
  "numberOfInferenceSteps": 50
}
Input Fields:
- width: (integer) The width of the output image in pixels. Default is 1024.
- height: (integer) The height of the output image in pixels. Default is 1024.
- prompt: (string) Describes the desired output; outlines the subject and surrounding context.
- refine: (string) Selects the refinement style. Options include no_refiner, expert_ensemble_refiner, and base_image_refiner.
- loraScale: (number) Adjusts the scale of the Low-Rank Adaptation method.
- scheduler: (string) Determines the scheduling algorithm used during generation.
- guidanceScale: (number) Scale for classifier-free guidance.
- applyWatermark: (boolean) Indicates if a watermark is applied to the generated image.
- negativePrompt: (string) Describes what should be excluded from the generated image.
- promptStrength: (number) Indicates the strength of the prompt during img2img or inpaint operations.
- numberOfOutputs: (integer) Specifies how many images will be generated (maximum of 4).
- highNoiseFraction: (number) Determines the noise fraction during expert_ensemble_refiner operations.
- numberOfInferenceSteps: (integer) Specifies the number of denoising steps during image generation.
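The constraints in the field list can be checked on the client before a request is sent. The following is a sketch only: build_payload is a hypothetical helper, the starting values are copied from the example input (only width and height are documented defaults), the lower bound of 1 on numberOfOutputs is an assumption, and the helper only knows the fields listed above.

```python
ALLOWED_REFINE = {"no_refiner", "expert_ensemble_refiner", "base_image_refiner"}
MAX_OUTPUTS = 4  # the field list states a maximum of 4


def build_payload(prompt: str, **overrides) -> dict:
    """Merge overrides into the example input and run basic checks."""
    # Values taken from the example input; not all are documented defaults.
    payload = {
        "width": 1024,
        "height": 1024,
        "prompt": prompt,
        "refine": "no_refiner",
        "loraScale": 0.6,
        "scheduler": "K_EULER",
        "guidanceScale": 7.5,
        "applyWatermark": True,
        "negativePrompt": "",
        "promptStrength": 0.8,
        "numberOfOutputs": 1,
        "highNoiseFraction": 0.8,
        "numberOfInferenceSteps": 50,
    }
    # Reject misspelled or unlisted field names early.
    unknown = set(overrides) - set(payload)
    if unknown:
        raise ValueError(f"Unknown input fields: {sorted(unknown)}")
    payload.update(overrides)

    if not payload["prompt"].strip():
        raise ValueError("prompt must not be empty")
    if payload["refine"] not in ALLOWED_REFINE:
        raise ValueError(f"refine must be one of {sorted(ALLOWED_REFINE)}")
    # Lower bound of 1 is an assumption; the source only gives the maximum.
    if not 1 <= payload["numberOfOutputs"] <= MAX_OUTPUTS:
        raise ValueError(f"numberOfOutputs must be between 1 and {MAX_OUTPUTS}")
    return payload
```

For example, build_payload("In the style of BTAS, a rooftop at night", numberOfOutputs=2) returns the example input with the prompt and output count replaced, while numberOfOutputs=5 raises a ValueError before any network call is made.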
Output
Upon execution, the action returns an array of URLs pointing to the generated images. Here’s an example of the output format:
[
  "https://assets.cognitiveactions.com/invocations/bdbc6a5b-2198-4a88-bcb5-faef0e7e2058/43aa54dc-e0bb-4858-a537-7cd2d063be6c.png"
]
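Since the output is a list of URLs, a common next step is to save each image locally. The sketch below only derives a file name for each URL; local_filenames is a hypothetical helper, and the ".png" fallback for URLs without an extension is an assumption based on the example output.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def local_filenames(urls: list[str], prefix: str = "output") -> list[str]:
    """Derive a local file name for each returned image URL."""
    names = []
    for index, url in enumerate(urls):
        # Reuse the extension from the URL path; fall back to .png (assumed).
        suffix = PurePosixPath(urlparse(url).path).suffix or ".png"
        names.append(f"{prefix}_{index}{suffix}")
    return names
```

Each name can then be paired with its URL and the image bytes written to disk with an HTTP client of your choice, for example requests.get(url, timeout=60).content.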
Conceptual Usage Example (Python)
Here's how you can call the Generate Refined Images action using Python. This example demonstrates constructing the input JSON payload and making the API call.
import json

import requests

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute"  # Hypothetical endpoint

action_id = "c9cfd9b7-fc06-4e4f-b22d-052e487faff1"  # Action ID for Generate Refined Images

# Construct the input payload based on the action's requirements
payload = {
    "width": 1024,
    "height": 1024,
    "prompt": "In the style of BTAS, a shadowy smile of a clown in a dark and mysterious hazy background",
    "refine": "no_refiner",
    "loraScale": 0.6,
    "scheduler": "K_EULER",
    "guidanceScale": 7.5,
    "applyWatermark": True,
    "negativePrompt": "",
    "promptStrength": 0.8,
    "numberOfOutputs": 1,
    "highNoiseFraction": 0.8,
    "numberOfInferenceSteps": 50,
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json",
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload},  # Hypothetical structure
        timeout=60,  # Avoid hanging indefinitely if the service does not respond
    )
    response.raise_for_status()  # Raise an exception for bad status codes (4xx or 5xx)
    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))
except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError:
            # The error body was not valid JSON; fall back to the raw text
            print(f"Response body: {e.response.text}")
In this snippet:
- The action_id is set to the ID of the Generate Refined Images action.
- The payload dictionary is populated with the required inputs as described earlier.
- The response is handled to check for success or errors, and the output is printed in a readable format.
Conclusion
The asronline/btas-refined Cognitive Actions give developers pre-built tools for image generation and refinement. Whether you’re looking to automate content generation or experiment with creative image styles, the Generate Refined Images action and the input fields described above are enough to get started. Explore further and start integrating these capabilities today!