Supercharge Your Applications with katarinadoros/kattis Image Generation Actions

Generating images from text prompts is a common requirement for developers who want to add visual content to their applications. The katarinadoros/kattis spec exposes Cognitive Actions for image generation, centered on the "Generate Image with Mask" action. This action creates images from a prompt and accepts a range of parameters, including image masks, model types, and aspect ratios. Using a pre-built action saves development time compared with building the generation pipeline yourself.
Prerequisites
Before diving into the integration of the Cognitive Actions, ensure you have the following:
- API Key: You'll need an API key for the Cognitive Actions platform.
- Basic Setup: Familiarity with making HTTP requests and handling JSON data.
Authentication generally works by passing your API key in the headers of your requests, ensuring secure access to the Cognitive Actions.
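A minimal sketch of building those headers, assuming the key is stored in an environment variable named COGNITIVE_ACTIONS_API_KEY (a name chosen here for illustration) and that the platform expects a Bearer token, as the usage example later in this article does. The helper name build_auth_headers is hypothetical.

```python
import os


def build_auth_headers(api_key=None):
    """Return request headers carrying the API key as a Bearer token."""
    # Prefer an explicit key; otherwise fall back to the environment so the
    # key is not hard-coded in source files.
    key = api_key or os.environ.get("COGNITIVE_ACTIONS_API_KEY")
    if not key:
        raise ValueError(
            "No API key provided and COGNITIVE_ACTIONS_API_KEY is not set"
        )
    return {
        "Authorization": f"Bearer {key}",  # Assumed scheme; check the platform docs
        "Content-Type": "application/json",
    }
```

Reading the key from the environment keeps it out of version control; pass it explicitly only in tests or short scripts.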
Cognitive Actions Overview
Generate Image with Mask
The Generate Image with Mask action creates images from a text prompt and a set of customizable parameters. It supports both image-to-image and inpainting modes, and it exposes parameters that appear to affect output quality and generation speed, such as inferenceStepCount and enableQuickMode.
- Category: Image Generation
Input
The following schema outlines the required and optional fields for this action:
{
  "prompt": "string (required)",
  "mask": "string (optional, uri)",
  "seed": "integer (optional)",
  "image": "string (optional, uri)",
  "model": "string (default: 'dev', optional)",
  "width": "integer (optional)",
  "height": "integer (optional)",
  "imageFormat": "string (default: 'webp', optional)",
  "outputCount": "integer (default: 1, optional)",
  "imageQuality": "integer (default: 80, optional)",
  "loraIntensity": "number (default: 1, optional)",
  "additionalLora": "string (optional)",
  "enableQuickMode": "boolean (default: false, optional)",
  "imageResolution": "string (default: '1', optional)",
  "promptIntensity": "number (default: 0.8, optional)",
  "imageAspectRatio": "string (default: '1:1', optional)",
  "guidanceIntensity": "number (default: 3, optional)",
  "inferenceStepCount": "integer (default: 28, optional)",
  "safetyCheckerDisabled": "boolean (default: false, optional)",
  "additionalLoraIntensity": "number (default: 1, optional)"
}
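The schema above can be turned into a small client-side check before sending a request. The sketch below copies the defaults from the schema and merges caller overrides onto them. The helper name build_payload is hypothetical, and the rule that a mask requires a source image is an assumption about inpainting, not something the schema states.

```python
# Defaults copied from the schema above; fields without a default are omitted.
DEFAULTS = {
    "model": "dev",
    "imageFormat": "webp",
    "outputCount": 1,
    "imageQuality": 80,
    "loraIntensity": 1,
    "enableQuickMode": False,
    "imageResolution": "1",
    "promptIntensity": 0.8,
    "imageAspectRatio": "1:1",
    "guidanceIntensity": 3,
    "inferenceStepCount": 28,
    "safetyCheckerDisabled": False,
    "additionalLoraIntensity": 1,
}

# Optional fields that have no default in the schema.
OPTIONAL_NO_DEFAULT = {"mask", "seed", "image", "width", "height", "additionalLora"}


def build_payload(prompt, **overrides):
    """Merge caller overrides onto the schema defaults and run basic checks."""
    if not isinstance(prompt, str) or not prompt.strip():
        raise ValueError("'prompt' is required and must be a non-empty string")
    unknown = set(overrides) - set(DEFAULTS) - OPTIONAL_NO_DEFAULT
    if unknown:
        raise ValueError(f"Unknown fields: {sorted(unknown)}")
    # Assumption: a mask is only meaningful alongside a source image (inpainting).
    if "mask" in overrides and "image" not in overrides:
        raise ValueError("'mask' was given without 'image'")
    return {"prompt": prompt, **DEFAULTS, **overrides}
```

For example, build_payload("a red bicycle", imageQuality=90) returns the defaults with imageQuality raised to 90. Sending the defaults explicitly is optional, since the service presumably applies them itself, but it makes each request self-describing.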
Example Input:
{
  "model": "dev",
  "prompt": "Portrait of a professional 23 year old girl Kattis, sitting at a desk in a modern, bright office environment. She has a confident and friendly expression, dressed in business-casual attire with a simple yet stylish dark blue blazer. In the background, a blurred view of a computer, plants, and office decor creates a professional but relaxed atmosphere. The lighting is natural and soft, highlighting her face and smile without showing the teeth, focusing on conveying a positive and competent demeanor. She has wavy brown hair with curtain bangs that falls on her shoulders. She wears sophisticated gold jewelry.",
  "imageFormat": "webp",
  "outputCount": 1,
  "imageQuality": 90,
  "loraIntensity": 1,
  "promptIntensity": 0.8,
  "imageAspectRatio": "1:1",
  "guidanceIntensity": 3.5,
  "inferenceStepCount": 28,
  "additionalLoraIntensity": 1
}
Output
The action typically returns an array of URLs pointing to the generated images.
Example Output:
[
  "https://assets.cognitiveactions.com/invocations/4a686ea3-a304-49a9-9110-564ad3c7289f/c9ac6a95-a206-45b5-be3c-6f041ac8a40a.webp"
]
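Since the output is a list of URLs, a caller usually needs to decide where each image will be stored. The sketch below derives a local file name from each URL using only the standard library. The helper names local_filename and plan_downloads are hypothetical, and the code assumes the last path segment of each URL is a usable file name, as it is in the example output.

```python
import os
from urllib.parse import urlparse


def local_filename(url):
    """Derive a local file name from the last path segment of an image URL."""
    name = os.path.basename(urlparse(url).path)
    if not name:
        raise ValueError(f"URL has no file name component: {url}")
    return name


def plan_downloads(image_urls, target_dir="."):
    """Map each returned URL to the local path it would be saved under."""
    return {
        url: os.path.join(target_dir, local_filename(url))
        for url in image_urls
    }
```

The actual download is left out here; one option is to fetch each URL with requests.get and write response.content to the planned path.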
Conceptual Usage Example (Python)
Here’s a conceptual example demonstrating how a developer might call the Generate Image with Mask action using Python:
import json

import requests

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute"  # Hypothetical endpoint

action_id = "dea46060-2664-4ddf-933f-4d125b589ceb"  # Action ID for Generate Image with Mask

# Construct the input payload based on the action's requirements
payload = {
    "model": "dev",
    "prompt": "Portrait of a professional 23 year old girl Kattis, sitting at a desk in a modern, bright office environment. She has a confident and friendly expression, dressed in business-casual attire with a simple yet stylish dark blue blazer. In the background, a blurred view of a computer, plants, and office decor creates a professional but relaxed atmosphere. The lighting is natural and soft, highlighting her face and smile without showing the teeth, focusing on conveying a positive and competent demeanor. She has wavy brown hair with curtain bangs that falls on her shoulders. She wears sophisticated gold jewelry.",
    "imageFormat": "webp",
    "outputCount": 1,
    "imageQuality": 90,
    "loraIntensity": 1,
    "promptIntensity": 0.8,
    "imageAspectRatio": "1:1",
    "guidanceIntensity": 3.5,
    "inferenceStepCount": 28,
    "additionalLoraIntensity": 1,
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json",
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload},  # Hypothetical structure
        timeout=60,  # Illustrative value; prevents the call from hanging indefinitely
    )
    response.raise_for_status()  # Raise an exception for bad status codes (4xx or 5xx)

    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))
except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError:  # Body is not valid JSON; fall back to raw text
            print(f"Response body: {e.response.text}")
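In this Python code snippet, replace "YOUR_COGNITIVE_ACTIONS_API_KEY" with your actual API key. The action_id corresponds to the Generate Image with Mask action, and the payload mirrors the example input shown earlier. The endpoint URL and the request body structure are hypothetical, so verify both against the platform's documentation before relying on them.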
Conclusion
The katarinadoros/kattis Cognitive Actions let developers generate images from detailed text prompts. With the Generate Image with Mask action, you can add customized visual content to an application without building the generation pipeline yourself. As a next step, consider experimenting with the image and mask inputs for image-to-image and inpainting workflows.